DOCUMENT RESUME

ED 419 354                                                    EC 306 443

AUTHOR        Jensema, Carl
TITLE         Presentation Rate and Readability of Closed Captioned
              Television. Final Report.
INSTITUTION   Institute for Disability Research and Training, Inc., Silver
              Spring, MD.
SPONS AGENCY  Department of Education, Washington, DC. Office of
              Educational Technology.
PUB DATE      1997-06-30
NOTE          70p.
CONTRACT      H180G40037
PUB TYPE      Reports - Research (143)
EDRS PRICE    MF01/PC03 Plus Postage.
DESCRIPTORS   *Captions; Hearing Impairments; *Program Content;
              *Readability; *Television; Television Research
IDENTIFIERS   *Closed Captioned Television; *Presentation Rates; Time
              Delay



ABSTRACT 



This report discusses the outcomes of a federally funded
project that investigated the characteristics of the captions on captioned
television programs. A sample of 183 captioned programs stratified by program
type was selected and recorded. In addition, 22 captioned music videos were
analyzed. Both roll-up and pop-on captions were analyzed. In the first part
of the study, captions were edited to remove commercials and then processed
by computer to get caption speed data. Caption rates among program types
varied considerably, with sports and music specials having the slowest
caption rates. The second part of the study determined the amount of editing
being done to program scripts. Ten-minute segments from two different shows
in each of 13 program categories were analyzed by comparing the caption
script to the program audio. The percentage of script edited out ranged from
0 percent to 19 percent. In the third part of the study, commonly used words
in captioning and their frequency of appearance were analyzed. All words from
all the programs in the study were combined into one large computer file.
This file, which contained 834,726 words, was sorted and found to contain
16,102 unique words. The following reports are appended: Presentation Speed
and Vocabulary in Closed Caption Television; Closed-Caption Television
Presentation Speed and Vocabulary (American Annals of the Deaf, October 1996,
Vol. 141); and Viewer Reaction to Different Captioned Television Speeds. (CR)



********************************************************************************
* Reproductions supplied by EDRS are the best that can be made
* from the original document.







Report 



for 



Presentation Rate and Readability 
of Closed Captioned Television



Department of Education 
Technology Research 
CFDA 84.180G 



Federal Award Number 
H180G40037 







Carl Jensema, Ph.D. 

Principal Investigator 

Institute for Disability Research and Training, Inc. 
2424 University Boulevard West 
Silver Spring, MD 20902 
301-942-4326 

June 30, 1997 






Presentation Rate and Readability 
of Closed Caption Television 

Final Report 



Objective 1 - Establish Advisory Board 

This project had several consultants and a formal Advisory Panel. The consultants were: 

Dr. Patricia Koskinen - Professor - University of Maryland 
Dr. Jane Haugh - President - Center for Developing Learning Potentials 
Dr. Robert Wilson - Professor Emeritus - University of Maryland 
Jeff Hutchins - Vice President - VITAC 

The Advisory Panel members were: 

Dr. Robert R. Davila - President - National Technical Institute for the Deaf 
Martin Block - Vice President - VITAC 

Mardi Loetermann - Research Director - National Center for Accessible Media (WGBH) 

Brenda Battat - Deputy Executive Director - Self Help for the Hard of Hearing 

Judith Johnson - Professor - Gallaudet University 

Dr. Linda Gambrell - Associate Dean - University of Maryland 

The consultants were brought in as needed. The Advisory Panel had full-day meetings at 
least annually. 

Objective 2 - Establish measurement system 

Carl Jensema, with assistance from Drs. Koskinen, Wilson, and Haugh, investigated
measurements of reading difficulty. Indices reviewed included Grammatik, Beta-Max's Reading
Estimator software, Micro Power & Light Reading Estimator, and several other measures with
which the consultants were familiar. In addition, attempts were made to establish our own
reading scale based on caption word frequency. After months of work, the Advisory Panel
advised us to abandon reading difficulty scales and focus on caption speed. Caption speed was
defined simply as the number of words shown on a program divided by the time during which
captions were on the screen. For example, a half-hour program may have captions on the screen
for only 17 minutes and 15 seconds. In calculating speed (in words per minute), the total number
of words in all the captions would be divided by 17.25.
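
The caption speed definition above can be illustrated with a short Python sketch. The function name and the word count are assumptions made for illustration only; the 17 minutes 15 seconds of on-screen time is the example given in the text.

```python
def caption_speed_wpm(total_words: int, caption_seconds: float) -> float:
    """Words per minute, measured only over the time captions were on screen."""
    if caption_seconds <= 0:
        raise ValueError("caption time must be positive")
    return total_words / (caption_seconds / 60.0)


# 17 minutes 15 seconds on screen is 17.25 minutes, as in the example above.
on_screen_seconds = 17 * 60 + 15

# Hypothetical word count, chosen for illustration; it is not a study figure.
hypothetical_word_count = 2415

speed = caption_speed_wpm(hypothetical_word_count, on_screen_seconds)
print(f"{speed:.1f} words per minute")
```

Dividing by on-screen time rather than by program length keeps silent stretches and commercial breaks from lowering the measured speed.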

Objective 3 - Obtain / analyze off air data 

Data were obtained from 183 programs and 22 music videos through the following
procedure:

1. Tape television programs off air.

2. Run program through a HUBCAP decoder to strip captions from Line 21, process the raw
caption code to obtain meaningful captions, attach a time code, and store them on a computer file.

3. Import the file into Microsoft Excel, edit out commercials and other non-program material.

4. Run the file through a custom analysis program to calculate statistics for the program.

5. Enter program statistics in the master database.

The data collected in this manner was analyzed and a report was written. This report was 
published in the October 1996 issue of the American Annals of the Deaf. 

The captions from all the programs were combined, sorted alphabetically, and collapsed 
into a frequency table. This frequency table became the basis for an article to be published in 
"Perspectives on Deafness and Education" in September, 1997. 
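
As a rough sketch of how captions might be combined and collapsed into a frequency table, the following Python fragment counts words across caption lines. The tokenization rule and the sample captions are assumptions; the project's own sorting program is not described in enough detail to reproduce, and this sketch does not merge plural forms.

```python
import re
from collections import Counter


def word_frequency_table(caption_lines):
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    counts = Counter()
    for line in caption_lines:
        # Assumed tokenization: runs of letters and apostrophes, case-folded.
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical caption lines, for illustration only.
sample_lines = ["The boy saw the dog.", "The dog ran."]
table = word_frequency_table(sample_lines)
for word, count in table:
    print(f"{word:10s} {count}")
```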

Jeff Hutchins at VITAC sent us Spanish caption scripts. We put considerable work into 
analyzing the Spanish word frequency in the same way we did for the English caption data we 
had. There should be a good journal article in this. A new sorting program was written to handle 
the special Spanish characters and the Spanish sort was done. The one remaining problem was 
how to combine similar words. For example, in English we combined plural forms (e.g. boy and 
boys were counted as a single unique word), but in Spanish there are many more extensions and 
decisions on combining need to be made. Unfortunately, work on this was not completed by our 
Spanish expert, Joe Robison, before he left the project to accept another job. We will look for a
Spanish language expert at one of the nearby universities and offer to give the data to them for 
development into a journal article. 

Objective 4 - Develop video materials 

Working with consultant Jeff Hutchins, three test videos were developed. The topics 
were "Nation's Capital", "Sailing", and "Space". Each video consisted of eight 30-second
segments, each captioned at a different specific speed. The speeds used in this project were:
96, 110, 126, 140, 156, 170, 186, and 200 words per minute. Each segment was separated from
the next one by 10 seconds of blank screen. The blank screen allowed the respondents time to 
mark their score sheets. 

The video material was created by selecting posters related to the topics and moving a 
video camera over them to give the illusion of motion. The videos had no audio. Each video was 
captioned with the exact number of words needed to create the desired caption speed. For 
example, a 30-second segment at 140 words per minute would have exactly 70 words in it. 
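
The word count needed for each segment follows directly from the target speed and the segment length. The sketch below applies that arithmetic to the eight speeds listed above; the function and variable names are illustrative assumptions.

```python
# The eight presentation speeds used in the test videos, in words per minute.
SPEEDS_WPM = [96, 110, 126, 140, 156, 170, 186, 200]
SEGMENT_SECONDS = 30


def words_needed(wpm, seconds):
    """Number of words that yields the given speed over a segment of this length."""
    return wpm * seconds / 60


word_targets = {wpm: words_needed(wpm, SEGMENT_SECONDS) for wpm in SPEEDS_WPM}
for wpm, words in word_targets.items():
    print(f"{wpm} WPM -> {words:.0f} words in {SEGMENT_SECONDS} seconds")
```

Because every speed chosen is an even number, each 30-second segment works out to a whole number of words.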

Two additional segments on the topic of "Art" were made. These segments were for use 
as part of the instructions to the participants. 

Participants were given a spoken and written introduction, asked to respond to a 
demographic questionnaire, filled out an eye chart, and responded to the two practice "Art" 
segments. They then watched a total of 24 video segments, responding to each one using a
five-point scale.







Objective 5 - Obtain / analyze child data 
Objective 6 - Obtain / analyze adult data 

Objectives 5 and 6 are combined because data collection from children and adults was 
done concurrently. Data was collected from residents of New York, Pennsylvania, New Jersey, 
West Virginia, Virginia, North Carolina, South Carolina, Florida, District of Columbia, and 
Maryland. A total of 578 subjects were used. Data analysis was done with a statistical package 
called Statview. The results were written up and have been submitted to the American Annals of 
the Deaf for publication. 

Objective 7 - Final report 

This manuscript is the final report. The three journal articles produced by the project are 
in the appendix of the report. 

Objective 8 - Dissemination 

Several hundred copies of the off-air paper were mailed to interested professionals. The 
paper was accepted for publication by the American Annals of the Deaf and published in their 
October, 1996 issue. A copy of the paper is attached to this report. 

The paper on caption word frequency was submitted to Perspectives on Education and 
Deafness at Gallaudet University. It was accepted for publication and will be in the September 
1997 issue. A copy of the paper is attached to this report. 

The paper on caption speed was submitted to the American Annals of the Deaf in June, 
1997. We fully expect to have it accepted for publication after the journal's review process is 
completed. A copy of the paper is attached to this report. 

The three journal articles will be made available on the IDRT web site, 

HTTP://WWW.IDRT.COM

The paper on the analysis of off air captions was given at the CAID/CEASD convention in 
Minneapolis in June 1995 and at the TDI convention in Boston in July 1995. 

The caption speed paper will be given at the Telecommunications for the Deaf, 
Incorporated convention in Kansas City, Missouri on July 15, 1997. Preparations for this have 
been made and all that remains is actually giving the paper. 

Objective 9 - Administration 

All monthly reports have been submitted. The final project report is being submitted. 







APPENDIX 



Presentation Speed and Vocabulary 
in Closed Captioned Television 



Carl Jensema, Ph.D. and Ralph McCann 



Institute for Disabilities Research and Training, Inc. 
1299 Lamberton Drive, Suite 200 
Silver Spring, MD 20902 



(301) 593-2690 v/tty 
(301) 593-9670 fax 

75032.2001@COMPUSERVE.COM e-mail 



December 1995 




Introduction 

In 1972, WGBH in Boston did a unique experiment in which they open-captioned a cooking 
program called "The French Chef" featuring Julia Child. The success of this first attempt at
captioning led WGBH to rebroadcast daily an open captioned version of "ABC World News 
Tonight" for hearing impaired people. During the 1970's this was the only regularly broadcast 
television program in America designed to be accessible to deaf people. It was wildly popular in the 
deaf community because it was the only televised news program they could understand. 

When WGBH began rebroadcasting the "ABC World News Tonight" there were no rules 
for captioning. Captioning policy was developed on a day to day basis as captioning problems 
arose. The guiding principle at that time was to make the program accessible to every deaf 
viewer, regardless of their individual reading ability. Since studies conducted by the Gallaudet 
University Office of Demographic Studies and others indicated that the average graduate from an 
educational program for hearing impaired students had about a third grade reading level, WGBH 
extensively edited the program dialogue. The number of words was cut by about a third and the
reading level was cut from roughly the sixth grade level to the third grade level. All passive voice 
sentence construction was removed, nearly all idioms were removed, contractions were 
eliminated, clauses were converted into short declarative sentences, and even jokes and puns were 
changed if it was felt the hearing impaired audience would not understand them. 

These captioning techniques, which almost everyone now considers over-editing, 
continued for many years. Part of the reason for this was that deaf people were so delighted to 
have captions that they accepted almost anything thrown on the screen. As captioned television 
became more entrenched as a standard part of television services in the late 1980's, deaf people 
began to examine the quality of captioning more closely. In general, deaf people indicated they 
wanted access to whatever was spoken on the audio and that captioners should not play the role 
of censors. Caption companies have tended to interpret this as meaning deaf people want straight 
verbatim captioning. 

Counting both broadcast and cable, there are now roughly 100 hours of captioned 
television programs shown each day, yet we have no formal data on the characteristics of the 
captions on these programs. Are programs now captioned verbatim? How much editing is done? 
What is the caption presentation speed of programs currently being shown on television? How 
does this presentation speed vary with the type of program? These and other questions are 
addressed in the research study reported here. 







Method 



Recording 



Caption data for this study was obtained from a sample of television programs recorded 
off-air. Based on the recommendations of an advisory panel of captioning experts, a sample of 
183 programs stratified by program type was selected and recorded in late 1994. Table 1 gives a 
breakdown of the program types and number of programs selected for each. The programs varied 
from a half-hour to four hours, with the film "Gettysburg" being the longest. The programs 
represented a total of approximately 180 hours of air time. Recording was done using the cable 
television service in a number of different homes. The exception was for some movies shown 
over premium cable channels. It proved easier to rent the films from a local video store than to 
record them off the cable system. All recording was done on an ordinary consumer-quality
4-head videocassette recorder (VCR).

In addition, the project staff gained access to 22 captioned music videos, each of which
was between two and five minutes in length. These were analyzed separately because they were
so different from the regular programming.



Table 1

Sample of Programs

Regular Programs            N      %
  Kids Animation           20     11
  Kids Educational         11      6
  Kids Action               6      3
  Prime Time Drama         26     14
  Situation Comedies       26     14
  Films                    21     11
  News                     20     11
  Documentaries            17      9
  Talk Shows               10      5
  Soap Operas               9      5
  Music Specials            6      3
  Sports                    6      3
  Live Performances         5      3
  Total Programs          183    100

Music Videos
  2 to 3 minute song       22

Total                     205



Data Extraction 

The videotapes were replayed and the signal was run through a
special closed caption decoder which read the captions from line 21 and fed them into a computer
file. Special software was written to read the computer's clock and attach a start time and an end
time to each line of caption data. This time-and-caption file was the basic raw data which was
analyzed for each program.

Those programs which were recorded off commercial channels had advertisements, and
even those on PBS or pay channels had station breaks or promotional material. All this
non-program material had to be edited out of each data file. This was done by importing each data file
into a spreadsheet and deleting the non-program parts, a lengthy and time-consuming process.

The result was a final "clean" data file for each program.

Time Analysis 

Analysis of the time data was much more complex than it might seem. The captions and
the control codes associated with them are transmitted in a steady binary-coded stream in the
television signal, but the captions do not necessarily appear on the screen strictly one after
another. There is a great deal of time overlap in the caption lines.

There are two kinds of captions, each with different characteristics. Roll-up captions 
scroll up the screen, usually in a three-line format. As one line rolls off a new line rolls up. 
Although three lines are usually used, two line and four line captions are also possible. The roll 
usually has a steady speed, but the captioner can make it speed up or slow down as needed to 
keep up with the program audio. Pop on captions are blocks of words which may have anywhere 
from one to four lines. They pop onto the screen and pop off after a few seconds. There may be 
more than one block of pop on captions on the screen at one time. Figure 1 shows a schematic of 
how roll-up and pop on captions overlap in time. The words are transmitted as one long stream 
of data, but control codes in the data stream make the decoder divide the words into caption lines 
and these caption lines have an overlap in screen display time. 

The "clean" data files in this study were analyzed with a custom computer software
program. Table 2 lists the information output by the program. "Total time
of program" is the actual time from when the program begins to when it ends, including break
time and commercial time. It does not include commercials or break time before and after the
program. "Total time of captions on screen" is the time during which program captions are
present on the screen. It does not include break time, commercial time, or program time during
which no captions are shown. All of the analysis in this study is based on "total time of captions
on screen".
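
Because caption lines overlap in display time, adding up each line's duration would count some seconds more than once. The report does not say how the custom program handled this, so the Python sketch below shows only one plausible approach: merge the overlapping (start, end) intervals and sum what remains. The timings are hypothetical.

```python
def total_time_on_screen(intervals):
    """Sum the time covered by (start, end) intervals, counting overlaps once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # A gap with no caption on screen: close out the previous run.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # This line overlaps the current run, so extend the run if needed.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Hypothetical timings in seconds: three overlapping roll-up lines,
# a gap with no captions, then one pop-on block.
lines = [(0.0, 6.0), (2.0, 8.0), (4.0, 10.0), (15.0, 18.0)]
on_screen = total_time_on_screen(lines)
print(on_screen)
```

In this example the individual line durations sum to 21 seconds, while captions are actually on screen for 13 seconds.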



Figure 1

Schematic Representation of
Caption Presentation Over Time

[Figure not reproduced. Two panels, "Roll-Up Captions" and "Pop On Captions", each plot
caption lines 1 through 10 against time.]



Table 2 

Output from Caption Time Analysis Program 



Total time of program 
Total time of captions on screen 
Total # of caption lines 
Total # of words 
Total # of characters 
Mean caption lines per minute 
Mean # of words per line 
Mean # of characters per line 
Mean # of words per minute 
Mean characters per minute 
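
The Table 2 quantities can all be derived from the caption lines and the on-screen time. The following Python sketch is an assumption-laden illustration, not the project's program: it splits words on whitespace and counts every character in a line, including spaces and punctuation, since the report does not define what was counted as a character.

```python
def caption_statistics(lines, minutes_on_screen):
    """Compute Table 2 style statistics for one program's caption lines."""
    if not lines or minutes_on_screen <= 0:
        raise ValueError("need at least one caption line and positive on-screen time")
    n_lines = len(lines)
    n_words = sum(len(line.split()) for line in lines)
    # Assumption: every character in the line counts, including spaces.
    n_chars = sum(len(line) for line in lines)
    return {
        "lines": n_lines,
        "words": n_words,
        "characters": n_chars,
        "lines_per_minute": n_lines / minutes_on_screen,
        "words_per_line": n_words / n_lines,
        "characters_per_line": n_chars / n_lines,
        "words_per_minute": n_words / minutes_on_screen,
        "characters_per_minute": n_chars / minutes_on_screen,
    }


# Hypothetical two-line caption shown for half a minute.
stats = caption_statistics(["HELLO THERE", "HOW ARE YOU"], 0.5)
for name, value in stats.items():
    print(f"{name}: {value}")
```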



Editing Level 

Hearing impaired people have repeatedly indicated that they prefer verbatim captioning. 
They know they are not always getting perfect verbatim captioning because they sometimes see 
an actor speak a word or group of words for which there is no caption on the screen. The 
problem is that no one seems to know how much editing is done and how much is lost in the 
conversion from audio to captioning. In this study, 26 programs were randomly selected and for 
each program a sample of 10 minutes of audio was compared to the words which were captioned. 
The results were tabulated to give an indication of the percent of program audio which is usually 
captioned. 
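
The percentage of program audio that is captioned reduces to a ratio of two word counts for the sampled segment. The sketch below uses hypothetical counts; how the study matched spoken words to captioned words is not described here, so the counting step itself is left out.

```python
def percent_captioned(spoken_words, captioned_words):
    """Share of the spoken words in a segment that appear in the captions."""
    if spoken_words <= 0:
        raise ValueError("segment must contain spoken words")
    return 100.0 * captioned_words / spoken_words


# Hypothetical counts for one 10-minute segment, for illustration only.
spoken = 1500
captioned = 1410
captioned_pct = percent_captioned(spoken, captioned)
edited_out_pct = 100.0 - captioned_pct
print(f"{captioned_pct:.0f}% captioned, {edited_out_pct:.0f}% edited out")
```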

Word Analysis 

What words are used in captioning? What is the frequency with which words appear in 
captions? To provide some insight into these questions, all the words in all the programs in this 
study were combined into one large computer file. This file, which contained 834,726 words, was 
sorted and the 16,102 unique words were arranged into a frequency table. 



Results and Discussion 



Program Characteristics 

A total of 205 programs were analyzed, 183 regular programs and 22 short (2-5 minute) 
music videos. Table 3 provides a breakdown of the programs by length. Overall, there were 
roughly 180 hours of video. 



Table 3

Program Length

Length              Number of Programs
5 minutes                   22
.5 hour                     78
1 to 1.5 hours              75
2 hours                     25
over 2 hours                 5
Total                      205



Table 4 shows the number of programs in this study which were captioned by each of the 
major caption companies. However, it should not be assumed that the distribution of programs 
reflects the size of a caption company’s business. For example, VITAC captions the one-hour Jay 
Leno program included in this study, but it captions that program five nights a week. This is 
about 10 times as much business as captioning a weekly half-hour sitcom. 



Table 4

Caption Companies

                      Number of Programs Captioned
Regular Programs
  NCI                         113
  WGBH                         45
  Captions, Inc.                9
  VITAC                         8
  All Others                    8
Music Videos
  NCI                           3
  WGBH                         19
Total                         205



Caption Speed 

Table 5 gives various breakdowns of caption statistics for the 183 programs analyzed.
(The 22 short music videos will be discussed separately.) For each program grouping, the mean,
standard deviation, maximum value, minimum value, and range are given for words-per-minute
(WPM), characters-per-minute (CPM), characters-per-word, caption-lines-per-minute,
words-per-line, and characters-per-line. Over all programs, the mean values were 141 WPM, 736 CPM,
5.2 characters per word, 38.7 lines per minute, 3.7 words per line, and 19.2 characters per line.
WPM and CPM are the two indexes usually used to measure caption speed. WPM has more
intuitive meaning for most people, but it can be influenced by differences in word length. Figures
2 and 3 present the mean WPM and CPM in graphic form. The graphs for WPM and CPM are
very similar in shape.
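
The overall means quoted above are related to one another: characters per minute should be roughly words per minute times characters per word, and both rates can be approximated from lines per minute. The Python sketch below checks this using only the figures in the text. The agreement is approximate, both because the published means are rounded and because a mean of per-program ratios need not equal the ratio of the means.

```python
# Overall means for the 183 programs, as quoted in the text.
wpm, cpm = 141, 736
chars_per_word, lines_per_min = 5.2, 38.7
words_per_line, chars_per_line = 3.7, 19.2

implied_cpm_from_words = wpm * chars_per_word            # about 733
implied_wpm_from_lines = lines_per_min * words_per_line  # about 143
implied_cpm_from_lines = lines_per_min * chars_per_line  # about 743


def relative_gap(implied, reported):
    """Absolute difference as a fraction of the reported value."""
    return abs(implied - reported) / reported


gaps = {
    "CPM from WPM x chars/word": relative_gap(implied_cpm_from_words, cpm),
    "WPM from lines/min x words/line": relative_gap(implied_wpm_from_lines, wpm),
    "CPM from lines/min x chars/line": relative_gap(implied_cpm_from_lines, cpm),
}
for label, gap in gaps.items():
    print(f"{label}: {gap:.1%} off the reported mean")
```

All three implied values fall within about two percent of the reported means, which is consistent with WPM and CPM tracking each other closely.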

There are two kinds of captions, popping and rolling. In this study, it was found that 
rolling captions generally present more words over a given period of time as compared to popping 
captions (151 WPM vs. 138 WPM), but that rolling captions are used for a wide range of audio 
speeds, from very slow (74 WPM) to very fast (231 WPM). 

Sports and music specials have the slowest caption rates. Sports tend to be visual in 
nature and most viewers are more interested in screen action than in the commentary. Music 
specials follow the pace of the music and the words to music are often sung more slowly than they 
would be spoken, resulting in a slower caption rate. Of course, there are exceptions, as will be 
seen in the discussion of music videos later. 

Children's programming also has a slow captioning rate, but that rate was faster than
expected. For children's educational, animation, and action programs, the rates were 124, 125,
and 131 WPM, respectively. The overall mean for children's programs was 126 WPM. Program 
speed ranged from 87 WPM for "Sesame Street" to 154 WPM for "Bill Nye". There is clearly a 
trend toward faster caption rates for programs aimed at older children, but beyond that little is 
known about matching caption speed with the reading speed of children. Much more research is 
needed in this area. 

In the mid range of caption speed are live performances (137 WPM), documentaries (139 
WPM), films (140 WPM), prime time drama (146 WPM), and sitcoms (147 WPM). These kinds 
of programs tend to be clustered around the mean captioning speed of 141 WPM found over all 
183 programs analyzed. 

The categories of soaps (154 WPM), news (157 WPM), and talk shows (177 WPM) 
provided the fastest caption speeds. The mean speed for talk shows was increased by the presence 
of two late-night programs, "Later With Greg Kinnear" (231 WPM) and "Last Call" (229 WPM).
Table 6 provides statistics for the programs with the five fastest and slowest caption speeds. The 
five fastest programs have more than twice the caption rate of the five slowest programs. 
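
The "more than twice" comparison can be checked against the per-program WPM values listed in Table 6. The short Python sketch below does that arithmetic; the variable names are illustrative.

```python
from statistics import mean

# WPM values for the five fastest and five slowest programs, from Table 6.
fastest_wpm = [231, 229, 183, 178, 177]
slowest_wpm = [94, 88, 87, 87, 74]

mean_fast = mean(fastest_wpm)
mean_slow = mean(slowest_wpm)
ratio = mean_fast / mean_slow

print(f"fastest mean: {mean_fast:.1f} WPM")
print(f"slowest mean: {mean_slow:.1f} WPM")
print(f"ratio: {ratio:.2f}")
```

The ratio comes out a little above 2.3, in line with the statement in the text.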




Table 5

Caption Speed Statistics

                  Words    Chars    Chars    Lines    Words    Chars
                Per Min  Per Min Per Word  Per Min Per Line Per Line

All Programs (n=183)
  Mean              141      736      5.2     38.7      3.7     19.2
  St.Dev.            21      108      0.2      6.0      0.5      2.7
  Maximum           231    1,171      6.2     55.3      5.0     25.9
  Minimum            74      357      4.7     19.1      2.8     14.0
  Range             157      814      1.5     36.2      2.2     11.9

Rolling Captions (n=48)
  Mean              151      781      5.2     34.8      4.4     22.5
  St.Dev.            31      165      0.2      7.2      0.3      1.8
  Maximum           231    1,171      5.6     55.3      5.0     25.9
  Minimum            74      357      4.8     19.1      3.4     16.3
  Range             157      814      0.8     36.2      1.6      9.6

Popping Captions (n=135)
  Mean              138      719      5.2     40.0      3.5     18.1
  St.Dev.            15       73      0.2      4.9      0.3      2.0
  Maximum           177      832      6.2     49.6      4.4     22.9
  Minimum            87      463      4.7     24.4      2.8     14.0
  Range              89      369      1.5     25.2      1.6      8.9

Talk Shows (n=10)
  Mean              177      897      5.1     40.4      4.4     22.2
  St.Dev.            30      151      0.1      6.4      0.3      1.3
  Maximum           231    1,171      5.3     55.3      5.0     24.6
  Minimum           142      713      4.9     33.2      4.1     20.7
  Range              89      458      0.4     22.1      0.9      4.0

Sports (n=6)
  Mean              106      535      5.1     23.2      4.6     23.0
  St.Dev.            15       79      0.1      3.0      0.2      1.2
  Maximum           126      645      5.2     26.3      4.9     25.0
  Minimum            88      442      4.9     19.1      4.1     21.4
  Range              38      203      0.3      7.2      0.7      3.6

Soaps (n=9)
  Mean              154      778      5.1     36.7      4.2     21.2
  St.Dev.            15       72      0.1      3.3      0.3      1.2
  Maximum           178      896      5.2     44.1      5.0     24.3
  Minimum           138      696      4.9     33.1      4.0     20.3
  Range              40      200      0.3     11.0      1.0      4.0

Sitcom (n=26)
  Mean              147      758      5.2     43.1      3.4     17.7
  St.Dev.            10       51      0.1      3.8      0.3      1.3
  Maximum           162      825      5.4     49.6      4.0     20.3
  Minimum           119      593      5.0     35.3      3.0     15.5
  Range              43      232      0.4     14.3      1.1      4.8

Prime Time (n=24)
  Mean              146      748      5.1     42.9      3.4     17.5
  St.Dev.            10       52      0.1      3.5      0.2      1.1
  Maximum           164      814      5.4     48.5      3.9     19.6
  Minimum           120      605      4.9     35.6      3.2     16.0
  Range              45      210      0.5     12.9      0.7      3.5

News (n=20)
  Mean              157      835      5.3     36.2      4.3     23.1
  St.Dev.            15       86      0.2      4.1      0.3      1.5
  Maximum           183      978      5.7     43.2      4.9     25.9
  Minimum           123      652      4.9     28.7      3.9     20.7
  Range              60      326      0.7     14.5      1.0      5.2

Music Specials (n=6)
  Mean              107      551      5.2     29.0      3.7     19.2
  St.Dev.            24      135      0.2      8.1      0.5      2.6
  Maximum           144      729      5.4     41.6      4.5     22.4
  Minimum            74      357      4.8     19.2      3.2     16.3
  Range              70      372      0.6     22.4      1.3      6.1

Live Performances (n=5)
  Mean              137      725      5.3     36.5      3.7     19.8
  St.Dev.            19       88      0.1      2.6      0.4      1.9
  Maximum           156      808      5.4     39.3      4.4     22.5
  Minimum           115      623      5.2     34.4      3.3     17.8
  Range              41      185      0.3      4.9      1.1      4.7

Kids Educational (n=10)
  Mean              124      667      5.4     34.6      3.5     18.7
  St.Dev.            18       99      0.2      4.9      0.3      1.7
  Maximum           154      791      5.7     38.8      4.1     21.7
  Minimum            87      463      5.0     24.4      3.1     16.8
  Range              66      328      0.7     14.4      1.0      4.9

Kids Animation (n=20)
  Mean              125      660      5.3     39.4      3.2     16.8
  St.Dev.            13       61      0.2      3.9      0.2      1.0
  Maximum           148      784      5.7     46.3      3.5     19.0
  Minimum           105      574      4.9     33.4      2.9     15.2
  Range              43      210      0.8     12.9      0.6      3.9

Kids Action (n=6)
  Mean              131      685      5.2     40.2      3.3     17.0
  St.Dev.            20      101      0.1      5.0      0.2      1.4
  Maximum           152      788      5.5     45.7      3.5     19.1
  Minimum            95      494      5.1     33.2      2.9     14.9
  Range              57      294      0.4     12.6      0.6      4.2

Film (n=22)
  Mean              140      710      5.1     41.3      3.4     17.3
  St.Dev.            13       59      0.2      3.9      0.4      1.9
  Maximum           177      832      5.4     47.9      4.2     20.5
  Minimum           121      607      4.7     32.1      2.8     14.0
  Range              56      225      0.7     15.8      1.4      6.4

Documentary (n=17)
  Mean              139      766      5.5     35.7      3.9     21.6
  St.Dev.            12       43      0.2      3.4      0.4      1.7
  Maximum           161      829      6.2     45.6      4.9     25.4
  Minimum           113      698      5.2     31.0      3.3     18.1
  Range              48      131      1.0     14.6      1.6      7.3



Table 6

Programs with Fastest and Slowest Caption Rates

                                          Caption   Words  Chars  Lines  Words  Chars  Chars
Program                 Type              Type       /Min   /Min   /Min  /Line  /Line  /Word

Fastest Programs
Later w/Greg Kinnear    Talk show         roll 3      231   1171     55    4.2     21    5.1
Last Call               Talk show         roll 3      229   1134     46    5.0     25    5.0
Connie Chung            News              roll 3      183    920     38    4.8     24    5.0
Guiding Light           Soap              roll 3      178    870     36    5.0     24    4.9
Meet the Press          Talk show         roll 3      177    930     40    4.4     23    5.3
Mean                                                  199   1005     43    4.7     24    5.0

Slowest Programs
ABC Sports: Golf        Sports            roll 2       94    463     20    4.7     23    4.9
TNT Basketball          Sports            roll 3       88    442     19    4.6     23    5.0
Sesame Street           Kids Educational  pop          87    463     27    3.2     17    5.3
Billboard Music Awards  Music Special     roll 3       87    430     19    4.5     22    5.0
Whitney Houston         Music Special     roll 3       74    357     22    3.4     16    4.8
Mean                                                   86    431     22    4.1     20    5.0



For comparison purposes, the mean WPM and CPM for various breakdown categories are 
presented in Figures 2 and 3. Since for most programs the number of characters per word does 
not vary greatly from the overall mean of 5.2 characters, the WPM and CPM graphs closely 
resemble each other in shape. The finding that word length does not vary greatly among programs 
is important. It had been suspected that programs considered more difficult to read might have a 
longer mean word length. This was not the case. For example, although "Sesame Street" is 
obviously easier to read than "Meet the Press", both have a mean word length of 5.3 characters. 

The music videos were analyzed as a separate category. Music videos were included in 
this study mostly as a matter of curiosity because they represent a unique kind of caption material. 
Figure 4 presents the caption speed for each of the 22 music videos. The speed varies from 60 to 
311 WPM, a much wider range than was found in the regular program categories. Many music 
videos flash images on the screen for a brief time. This makes captions harder to read because the 
viewer's attention is distracted. The fastest and most difficult to read captions were found in rap 
music. For example, the captions for the song "Freak It" proved impossible to understand 
without repeated viewing. 

Caption Editing 

For each of the program categories, two programs were selected and a 10-minute segment 
of each was carefully analyzed to see if there were any words spoken but not captioned. The 
results are given in Table 7. Several programs were 100% captioned. The most edited program 
was an ABC golf program where only 81% of the spoken words were captioned. This program 
was clearly an anomaly because it was captioned live and rolling captions were used, meaning that 
there were many times when captions could not be put on screen without covering up a player 
putting or a ball rolling toward a cup. 

Among the 26 programs, the average was 94% captioned. When the golf program was 
excluded, the average was 95% captioned. To take a closer look at the material being edited, two 
programs were selected and a word-by-word inspection was made. "Hangin' with Mr. Cooper" 
was selected as the most edited (87% captioned) program with pop-on captions. The NBC "Today" 
show was selected as an example of a highly edited (91% captioned) program with roll-up captions. 
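
The two averages can be reproduced from the Table 7 percentages. The sketch below is only a check of the arithmetic; the dictionary simply restates the Table 7 values, and the variable names are illustrative.

```python
from statistics import mean

# Percent of spoken words captioned, restated from Table 7.
percent_captioned = {
    "The Bold and the Beautiful": 100, "Guiding Light": 100,
    "Wild America": 100, "Great Railroad Journey": 99,
    "Ace Ventura": 98, "Madame Butterfly": 97,
    "David Letterman": 99, "Jay Leno": 96,
    "Clio Awards": 97, "Seigfried and Roy": 95,
    "Arly Hanks": 97, "ER": 94,
    "Whitney Houston": 100, "Billy Ray Cyrus Special": 91,
    "ABC News": 98, "TODAY": 91,
    "Power Rangers": 96, "California Dreams": 90,
    "Animaniacs": 97, "Batman - The Series": 89,
    "Kids Songs": 93, "Barney": 88,
    "In Living Color": 91, "Hangin With Mr. Cooper": 87,
    "CBS Sports: Figure Skating": 90, "ABC Sports: Golf": 81,
}

all_programs = mean(list(percent_captioned.values()))
without_golf = mean(
    [pct for title, pct in percent_captioned.items()
     if title != "ABC Sports: Golf"]
)

print(round(all_programs), round(without_golf))  # prints: 94 95
```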

Table 8 shows the changes made in a segment of the "Mr. Cooper" program. The first 
column gives the exact words which were spoken. The second column gives the words which 
were removed, the third column gives the words added, and the fourth column gives the actual 
captions which appeared on the screen. Most of the editing does not change the meaning of the 
text. The changes usually just provide a slight simplification of the sentence structure. The 
editing does not really seem necessary. Perhaps some of the changes were made because the 
captioner's supervisor gave instructions to caption at a certain WPM rate. For example, replacing 
"he likes to listen" with "he likes listening" changes the line from four words to three words, but it 
doesn't make the line shorter or easier to read. Another possibility is that the studio provided the 
captioner with a script and the captioner captioned the program verbatim, then the studio decided 
to go over the program again and "sweeten" the audio after it was captioned. 
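
The "Remove" and "Add" columns of Table 8 amount to a word-level difference between the spoken line and the caption line. The comparison in this study was done by hand; the sketch below only illustrates one way such columns could be derived automatically, using Python's standard `difflib` module. The function name `edit_columns` is hypothetical.

```python
from difflib import SequenceMatcher


def edit_columns(spoken, caption):
    """Return (removed, added) word lists for one spoken/caption pair."""
    a, b = spoken.split(), caption.split()
    removed, added = [], []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(a[i1:i2])   # words spoken but not captioned
        if tag in ("insert", "replace"):
            added.extend(b[j1:j2])     # words captioned but not spoken
    return removed, added


removed, added = edit_columns(
    "WE WERE JUST TRYING TO FIND OUT",
    "WE WANTED TO FIND OUT",
)
print(removed, added)
```

For this pair the removed words are WERE, JUST and TRYING and the added word is WANTED, which agrees with the corresponding row of Table 8. Punctuation is left attached to words here, so a real comparison would probably want to strip it first.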








Figure 2 

Mean Words Per Minute 






Figure 3 

Mean Characters Per Minute 









Figure 4 

Music Video Words per Minute 



Human Behavior 
Big Time Sensuality 
Evaleline 
Came Undone 
Into Your Arms 
One Caress 
Not Quite Sonic 
Blue 
My Sister 
Cannonball 
Rain Will Fall 
The Gift 
God 
Runaway Love 
Paying the Price of Love 
Work for Food 
Push Th' Little Daisies 
Who Was in My Room Last Night? 
Classic Material 
Whatta Man 
What's Next 
Freak It 

[Bar chart not reproduced. Horizontal axis: words per minute, 0 to 300.] 



Table 7 

Percentage of Audio Captioned 



Program Type | Program Title | Percent Captioned
Soap | The Bold and the Beautiful | 100
 | Guiding Light | 100
Documentary | Wild America | 100
 | Great Railroad Journey | 99
Film | Ace Ventura | 98
 | Madame Butterfly | 97
Talk Show | David Letterman | 99
 | Jay Leno | 96
Live Performance | Clio Awards | 97
 | Seigfried and Roy | 95
Prime Time | Arly Hanks | 97
 | ER | 94
Music Special | Whitney Houston | 100
 | Billy Ray Cyrus Special | 91
News | ABC News | 98
 | TODAY | 91
Kids Action | Power Rangers | 96
 | California Dreams | 90
Kids Animation | Animaniacs | 97
 | Batman - The Series | 89
Kids Educational | Kids Songs | 93
 | Barney | 88
Sitcom | In Living Color | 91
 | Hangin With Mr. Cooper | 87
Sports | CBS Sports: Figure Skating | 90
 | ABC Sports: Golf | 81



Table 8 

Changes in "Mr. Cooper" 



Spoken | Remove | Add | Caption
TURN IT UP, I CAN'T HEAR ANYTHING. | I CAN'T HEAR ANYTHING. |  | TURN IT UP,
SHH! HE'S ON THE PHONE. |  |  | SHH! HE'S ON THE PHONE.
COME ON, BABY. |  |  | COME ON, BABY.
YOU KNOW YOU DON'T HAVE | YOU KNOW |  | YOU DON'T HAVE
TO GO SHOPPING. |  |  | TO GO SHOPPING.
YOU KNOW WHAT BIG DADDY |  |  | YOU KNOW WHAT BIG DADDY
WANT FOR HIS BIRTHDAY. |  |  | WANT FOR HIS BIRTHDAY.
HOLD ON |  |  | HOLD ON
LET ME CALL YOU BACK, ALL RIGHT. | ALL RIGHT |  | LET ME CALL YOU BACK,
WHAT DOES HE WANT? |  |  | WHAT DOES HE WANT?
HEY, BIG DADDY. |  |  | HEY, BIG DADDY.
WE'RE SORRY COUSIN MARK. | COUSIN MARK |  | WE'RE SORRY
WE WERE JUST TRYING TO FIND OUT | WERE JUST TRYING | WANTED | WE WANTED TO FIND OUT
WHAT YOU WANTED |  |  | WHAT YOU WANTED
FOR YOUR BIRTHDAY. |  |  | FOR YOUR BIRTHDAY.
WELL YOU KNOW YOU TWO SHOULDN'T | WELL YOU KNOW TWO |  | YOU SHOULDN'T
BE EAVESDROPPING. |  |  | BE EAVESDROPPING.
'CAUSE YOU NEVER KNOW | 'CAUSE |  | YOU NEVER KNOW
WHAT YOU MIGHT HEAR, |  |  | WHAT YOU MIGHT HEAR,
LIKE HOW TYLER'S |  |  | LIKE HOW TYLER'S
PARENTS ARE SENDING HIM | PARENTS ARE SENDING HIM | BEING SENT | BEING SENT
TO MILITARY SCHOOL. |  |  | TO MILITARY SCHOOL
THE FEW, THE PROUD, |  |  | THE FEW, THE PROUD,
THE BIG-HEADED. |  |  | THE BIG-HEADED.
NOW WHERE'D YOU GET | NOW |  | WHERE'D YOU GET
THE WALKIE-TALKIE? |  |  | THE WALKIE-TALKIE?
IT'S A BABY MONITOR |  |  | IT'S A BABY MONITOR.
MY DAD USES IT TO LISTEN IN | MY |  | DAD USES IT TO LISTEN IN
ON THE BABYSITTER |  |  | ON THE BABYSITTER.
YOU MEAN YOUR BABY SISTER | YOUR |  | YOU MEAN BABY SISTER.
NO. I MEAN THE BABYSITTER |  |  | NO. I MEAN THE BABYSITTER.
HE LIKES TO LISTEN | TO | ING | HE LIKES LISTENING
TO HER READ BEDTIME STORIES, |  |  | TO HER READ BEDTIME STORIES
OR AT LEAST UNTIL | OR |  | AT LEAST UNTIL
MY MOTHER CAUGHT HIM. |  |  | MY MOTHER CAUGHT HIM
MOM REALLY RAKED IT IN |  |  | MOM REALLY RAKED IT IN
THIS CHRISTMAS. |  |  | THIS CHRISTMAS.
WELL, ALL RIGHT, GOMER |  |  | WELL, ALL RIGHT, GOMER.
GET OUT OF HERE |  |  | GET OUT OF HERE
AND TAKE PRIVATE BENJAMIN |  |  | AND TAKE PRIVATE BENJAMIN
WITH YOU. GET OUT. | GET OUT. |  | WITH YOU.
KIDS AND THEIR TOYS. |  |  | KIDS AND THEIR TOYS.
THIS IS A GOOD WAY |  |  | THIS IS A GOOD WAY
FOR ME TO FIND OUT |  |  | FOR ME TO FIND OUT
WHAT I'M GETTING |  |  | WHAT I'M GETTING
FOR MY BIRTHDAY THOUGH. |  |  | FOR MY BIRTHDAY THOUGH
HI, MARK. |  |  | HI, MARK.
HEY! HEY. | HEY | HA HA | HEY! HA HA
WHAT'S IN THE BAGS, GIRLS? |  |  | WHAT'S IN THE BAGS, GIRLS?
UH, BIRTHDAY PLATES, |  |  | UH, BIRTHDAY PLATES,
PARTY CANDLES, |  |  | PARTY CANDLES,
60 OF MY CLOSEST |  |  | 60 OF MY CLOSEST
FRIENDS, WHAT? |  |  | FRIENDS, WHAT?
GEE MARK, I DON'T KNOW WHAT YOU | I DON'T KNOW | ARE | GEE MARK, WHAT ARE YOU
TALKING ABOUT? |  |  | TALKING ABOUT?




Tables 9a, 9b, and 10 show two different kinds of editing for the "Today" program. This 
program is partly scripted and partly live. For the scripted part, the caption company is given a 
copy of the script before the show airs. They convert the script to captions and feed these 
captions into the broadcast at air time. The announcers on the screen see the same script on a 
teleprompter, but they do not always say exactly the same words that they read. The result is 
"editing" which is actually ad-libbing on the part of the announcers. Table 9a shows a scripted 
segment where several people are interacting. There is considerable ad-libbing. Table 9b shows a 
scripted segment which is straight news reporting; the announcer stays with the script and there is 
very little difference between the spoken and captioned versions. Table 10 shows a segment of 
"Today" which was captioned live by a stenocaptioner. There is a great deal of editing, but the 
essential information is still there. 

Word Analysis 

The caption scripts from all the programs in this study were combined into one large 
computer file. This file was edited to remove punctuation and anything else which was not a 
word. Certain non-standard "words", such as "uh", "mmmmm", and "ahhhh", were kept, since 
they are commonly used in captioning to indicate certain sounds in the audio. The resulting word 
list was sorted and arranged into a frequency table. The file had 834,726 words, of which 16,102 
were unique. Just 10 words (the, you, to, a, I, and, of, in, it, that) accounted for 176,793 of the 
834,726 words (21%). Half of all the words captioned were accounted for by just 79 unique 
words. Figure 5 gives a graph of the cumulative frequency of the 4,000 most frequent unique 
words. The horizontal axis gives the number of unique words and the vertical axis gives the 
percent of the entire word file accounted for by those unique words. Table 11 gives a list of the 
250 most frequent unique words. These words account for more than 2/3 of all words used in the 
captions in this study. 
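
The frequency table and cumulative percentages described above can be sketched in a few lines. This is not the software used in the study; it is a minimal illustration that assumes words are runs of letters and apostrophes (so contractions such as DON'T count as single words), and the function names are hypothetical.

```python
import re
from collections import Counter


def coverage_table(text):
    """Return (word, frequency, cumulative percent) rows, most frequent first."""
    words = re.findall(r"[a-z']+", text.lower())  # drops punctuation and digits
    counts = Counter(words)
    total = len(words)
    running = 0
    table = []
    for word, freq in counts.most_common():
        running += freq
        table.append((word, freq, 100 * running / total))
    return table


def words_needed(table, percent):
    """Number of top-ranked unique words needed to reach a coverage level."""
    for rank, (_, _, cumulative) in enumerate(table, start=1):
        if cumulative >= percent:
            return rank
    return len(table)


sample = "the cat and the dog and the bird"
table = coverage_table(sample)
print(table[0])                 # most frequent word with its cumulative percent
print(words_needed(table, 50))  # unique words needed to cover half the text
```

Applied to a full caption file, `words_needed(table, 50)` would give the kind of figure quoted above for the number of unique words that account for half of all words captioned.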

For comparison, the frequency distributions of the words in about a dozen individual 
programs were examined. All the cumulative frequency graphs for these programs were very 
similar. Figure 6 provides a cumulative frequency graph for the 678 unique words used in an 
episode of "Wings", a typical situation comedy. For comparison purposes, the graph also includes 
the cumulative frequency curve for the 678 most frequently used words among all programs. The 
"All Programs" line provides a lower bound for the frequency curve of any individual program, 
since it represents all unique words available among all programs in this study. 

In this instance, just 51 unique words accounted for half of all words used in the captions 
for this "Wings" episode and 174 words accounted for 75% of the words used. The important 
point is that captioned television (and by inference, the audio which the captions represent) uses 
relatively few unique words. There are at least 500,000 words in the English language, but 
learning fewer than 500 words will cover most of the vocabulary in any television program shown in 
the United States today. 







Table 9a 

Changes in Scripted "Today" 



Spoken | Remove | Add | Caption
AND WELCOME TO "TODAY" |  |  | >>> AND WELCOME TO "TODAY"
ON THIS THURSDAY MORNING. |  |  | ON THIS THURSDAY MORNING.
I'M KATIE COURIC. |  |  | I'M KATIE COURIC.
AND I'M MATT LAUER. FILLING IN FOR | FILLING IN FOR |  | >> AND I'M MATT LAUER,
BRYANT GUMBELL WHO IS ON | GUMBELL WHO |  | BRYANT IS ON
VACATION THIS WEEK. |  |  | VACATION THIS WEEK.
AND MATT AHEAD IN OUR FIRST HALF | AND MATT |  | >> AHEAD IN OUR FIRST HALF
HOUR THIS MORNING, | THIS MORNING |  | HOUR,
WE'RE GOING TO GET AN UPDATE | 'RE GOING TO | 'LL | WE'LL GET AN UPDATE
ON THE LATEST DEVELOPMENTS |  |  | ON THE LATEST DEVELOPMENTS
IN THE O.J. SIMPSON CASE |  |  | IN THE O.J. SIMPSON CASE
AND HEAR WHAT NICOLE BROWN |  |  | AND HEAR WHAT NICOLE BROWN
SIMPSON'S SISTER HAD TO SAY |  |  | SIMPSON'S SISTER HAD TO SAY
OUTSIDE THE COURTROOM |  |  | OUTSIDE THE COURTROOM.
WE'LL ALSO LOOK |  |  | WE'LL ALSO LOOK
AT THE BIZARRE AND VERY TRAGIC | VERY |  | AT THE BIZARRE AND TRAGIC
STORY OUT OF SWITZERLAND, |  |  | STORY OUT OF SWITZERLAND,
WHERE 48 PEOPLE DIED |  |  | WHERE 48 PEOPLE DIED
IN A MASS SUICIDE |  |  | IN A MASS SUICIDE.
MATT, AND ANOTHER SAD | MATT | VERY | >> AND ANOTHER VERY SAD
STORY THIS MORNING - KATIE | KATIE |  | STORY THIS MORNING
THE PARENTS OF A YOUNG AMERICAN BOY | AMERICAN |  | THE PARENTS OF A YOUNG BOY
KILLED BY BANDITS IN ITALY |  |  | KILLED BY BANDITS IN ITALY
A WEEK AGO TODAY. |  |  | A WEEK AGO TODAY.
THEY DONATED HIS ORGANS |  | ALL | THEY DONATED ALL HIS ORGANS
SO ITALIANS MIGHT LIVE. |  |  | SO ITALIANS MIGHT LIVE.
ALSO AHEAD ACTOR JOHN TRAVOLTA IS | ALSO AHEAD IS | WILL BE | ACTOR JOHN TRAVOLTA WILL BE
HERE TO TALK ABOUT |  |  | HERE TO TALK ABOUT
HIS LATEST MOVIE, WHICH IS |  |  | HIS LATEST MOVIE, WHICH IS
GETTING A LOT OF CRITICAL |  |  | GETTING A LOT OF CRITICAL
ACCLAIM, IT'S CALLED "PULP FICTION." | IT'S CALLED |  | ACCLAIM, "PULP FICTION."
BASEBALL GREAT MICKEY MANTLE |  |  | BASEBALL GREAT MICKEY MANTLE
WILL BE ALONG AND WE'LL |  |  | WILL BE ALONG AND WE'LL
LEARN SOME HEALTHY AND TASTY |  |  | LEARN SOME HEALTHY AND TASTY
WAYS TO PREPARE SEAFOOD. | SEAFOOD | FISH | WAYS TO PREPARE FISH
WHAT KIND OF SEAFOOD? |  |  | >> WHAT KIND OF SEAFOOD?
I THINK TODAY WE'RE DOING |  |  | >> I THINK TODAY WE'RE DOING
STEAMED SHRIMP AND YOU'RE GO |  |  | STEAMED SHRIMP AND YOU'RE GO
TO HELP. |  |  | TO HELP.
I AM, I'M GONNA BE YOUR SOUSCHEF. | AM, I'M GONNA | WILL | >> I WILL BE YOUR SOUSCHEF.
YOU'RE THE STEAMER. | YOU'RE THE STEAMER. |  | 
OK, BUT LET'S GET STARTED | OK, BUT GET STARTED | GO TO | LET'S GO TO
WITH THE MORNING'S | WITH THE MORNING'S |  | 
TOP NEWS STORY OVER AT | TOP NEWS STORY OVER AT |  | 
THE NEWSDESK |  |  | THE NEWSDESK
AND FOR THAT WE WILL TURN TO | FOR THAT |  | AND WE WILL TURN TO
ELIZABETH VARGAS. |  |  | ELIZABETH VARGAS.
GOOD MORNING, KATIE AND MATT. | KATIE AND MATT. |  | >> GOOD MORNING,
GOOD MORNING, EVERYONE. | GOOD MORNING |  | EVERYONE.
JURY SELECTION WILL BE |  |  | >>> JURY SELECTION WILL BE
ON THE SIDELINES AGAIN TODAY |  |  | ON THE SIDELINES AGAIN TODAY
AT THE O.J. SIMPSON TRIAL |  |  | AT THE O.J. SIMPSON TRIAL
IN THE CONTINUING DEBATE |  |  | IN THE CONTINUING DEBATE
OVER EVIDENCE TAKEN |  |  | OVER EVIDENCE TAKEN
FROM SIMPSON'S CAR. |  |  | FROM SIMPSON'S CAR.




Table 9b 

Changes in Scripted "Today" 



Spoken | Remove | Add | Caption
THE GRIM SEARCH |  |  | >>> THE GRIM SEARCH
CONTINUES THROUGH THE RUINS |  |  | CONTINUES THROUGH THE RUINS
OF BURNED-OUT HOMES | HOMES | HOUSES | OF BURNED-OUT HOUSES
IN SWITZERLAND. |  |  | IN SWITZERLAND.
IT'S THE AFTERMATH |  |  | IT'S THE AFTERMATH
OF AN APPARENT MASS SUICIDE |  |  | OF AN APPARENT MASS SUICIDE
BY MEMBERS OF A DOOMSDAY |  |  | BY MEMBERS OF A DOOMSDAY
CULT THAT HAS LEFT AT LEAST |  |  | CULT THAT HAS LEFT AT LEAST
50 PEOPLE DEAD |  |  | 50 PEOPLE DEAD
IN SWITZERLAND AND IN CANADA | IN |  | IN SWITZERLAND AND CANADA
DETAILS NOW FROM NBC'S | NOW |  | DETAILS FROM NBC'S
KEITH MILLER. |  |  | KEITH MILLER.
THE POLICE SAY THE DEATH |  |  | >> THE POLICE SAY THE DEATH
TOLL COULD GO HIGHER. |  |  | TOLL COULD GO HIGHER.
INVESTIGATORS WAITED UNTIL |  |  | INVESTIGATORS WAITED UNTIL
THIS MORNING TO SEARCH |  |  | THIS MORNING TO SEARCH
A BURNT-OUT SKI CHALET | T | ED | A BURNED-OUT SKI CHALET
FEARING IT COULD BE |  |  | FEARING IT COULD BE
BOOBY-TRAPPED |  |  | BOOBY-TRAPPED
A RELIGIOUS SECT CALLED |  |  | A RELIGIOUS SECT CALLED
THE ORDER OF THE SOLAR |  |  | THE ORDER OF THE SOLAR
TEMPLE IS BEHIND, WHAT |  |  | TEMPLE IS BEHIND, WHAT
POLICE CALL, A BIZARRE |  |  | POLICE CALL, A BIZARRE
RITUAL SLAUGHTER. |  |  | RITUAL SLAUGHTER.
23 BODIES WERE FOUND IN THIS |  |  | 23 BODIES WERE FOUND IN THIS
BURNED-OUT FARMHOUSE |  |  | BURNED-OUT FARMHOUSE
IN THE VILLAGE OF CHEIRY, |  |  | IN THE VILLAGE OF CHEIRY,
80 MILES NORTHEAST |  |  | 80 MILES NORTHEAST
OF GENEVA. |  |  | OF GENEVA
ANOTHER 23 BODIES WERE |  |  | ANOTHER 25 BODIES WERE
DISCOVERED IN THREE SKI |  |  | DISCOVERED IN THREE SKI
CHALETS 90 MILES AWAY. |  |  | CHALETS 90 MILES AWAY.
MASS SUICIDE IS POSSIBLE |  |  | MASS SUICIDE IS POSSIBLE
SO IS MURDER. |  |  | SO IS MURDER
TWENTY OF THE VICTIMS |  |  | TWENTY OF THE VICTIMS
IN THE FARMHOUSE HAD BEEN |  |  | IN THE FARMHOUSE HAD BEEN
SHOT. |  |  | SHOT.
MOST OF THE BODIES WERE |  |  | MOST OF THE BODIES WERE
FOUND IN AN UNDERGROUND ROOM |  |  | FOUND IN AN UNDERGROUND ROOM
THAT MAY HAVE BEEN USED |  |  | THAT MAY HAVE BEEN USED
FOR RELIGIOUS RITUALS. |  |  | FOR RELIGIOUS RITUALS.
EVERYTHING LOOKED LIKE |  |  | >> EVERYTHING LOOKED LIKE
LIKE PEOPLE LIKE IN A WAX MUSEUM | LIKE |  | PEOPLE LIKE IN A WAX MUSEUM.
SIMILAR CIRCUMSTANCES |  |  | >> SIMILAR CIRCUMSTANCES
SURROUNDED THE DEATHS OF TWO |  |  | SURROUNDED THE DEATHS OF TWO
PEOPLE NEAR MONTREAL |  |  | PEOPLE NEAR MONTREAL
ON TUESDAY. |  |  | ON TUESDAY.
THEY WERE DISCOVERED |  |  | THEY WERE DISCOVERED
IN THE BURNT-OUT DUPLEX |  |  | IN THE BURNT-OUT DUPLEX
ADJACENT TO THE ONE OWNED |  |  | ADJACENT TO THE ONE OWNED
BY THE SECT'S LEADER. |  |  | BY THE SECT'S LEADER,
LUC JOURET. |  |  | LUC JOURET.




Table 10 

Changes in Live "Today" 

[Table 10 is not legible in the scanned source.] 



Figure 5 

Cumulative Frequency Percentage for 
4000 Most Frequent Unique Words 



[Graph not reproduced. Horizontal axis: Words.] 



Table 11 

Frequently Used Words 



Word Freq. Percent | Word Freq. Percent | Word Freq. Percent | Word Freq. Percent
THE 30142 3.61 | FROM 2373 46.27 | TOO 1048 58.06 | THOUGHT 652 64.23
YOU 22600 6.32 | THAT'S 2343 46.55 | DIDN'T 1040 58.18 | BELIEVE 650 64.31
TO 22161 8.97 | LOOK 2324 46.83 | HA 1034 58.31 | BOY 646 64.38
A 20023 11.37 | HIM 2316 47.1 | NEW 1023 58.43 | THREE 644 64.46
I 19991 13.77 | YOU'RE 2285 47.38 | TALK 1020 58.55 | EVERY 641 64.54
AND 16130 15.7 | TIME 2243 47.65 | INTO 1012 58.67 | CAPTION 639 64.61
OF 13914 17.37 | WHEN 2231 47.91 | WORK 1007 58.79 | EVER 639 64.69
IN 10941 18.68 | SEE 2230 48.18 | PLAY 1006 58.91 | SHOW 636 64.77
IT 10496 19.93 | HOW 2214 48.45 | TRY 998 59.03 | AWAY 635 64.84
THAT 10395 21.18 | SAY 2200 48.71 | MUCH 988 59.15 | ALWAYS 626 64.92
IS 8764 22.23 | GOOD 2155 48.97 | GUY 987 59.27 | ANYTHING 607 64.99
THIS 7116 23.08 | BY 2115 49.22 | I'VE 980 59.39 | AM 598 65.06
FOR 6679 23.88 | HAD 2041 49.47 | UH 976 59.5 | LONG 593 65.13
ON 6411 24.65 | YEAH 1971 49.7 | MEAN 954 59.62 | ASK 587 65.2
WAS 5945 25.36 | AN 1968 49.94 | THERE'S 954 59.73 | TODAY 587 65.27
HAVE 5804 26.06 | WOULD 1899 50.17 | ONLY 938 59.84 | NAME 583 65.34
ME 5740 26.75 | DID 1804 50.38 | GIVE 924 59.96 | RUN 583 65.41
WE 5521 27.41 | TAKE 1794 50.6 | OFF 920 60.07 | PLACE 581 65.48
WHAT 5464 28.06 | WERE 1765 50.81 | ANY 917 60.18 | STOP 580 65.55
BE 5449 28.71 | MAKE 1757 51.02 | FEEL 907 60.28 | WHICH 570 65.62
HE 5218 29.34 | BACK 1739 51.23 | THESE 905 60.39 | SORRY 566 65.69
WITH 4895 29.93 | WHO 1719 51.43 | GREAT 884 60.5 | FRIEND 564 65.76
MY 4834 30.5 | BEEN 1707 51.64 | LET'S 884 60.6 | BETTER 563 65.82
YOUR 4385 31.03 | HAS 1697 51.84 | PREPARE 871 60.71 | THROUGH 562 65.89
DO 4375 31.55 | THEM 1599 52.03 | LET 863 60.81 | HOUSE 559 65.96
I'M 4258 32.06 | OR 1553 52.22 | LIFE 859 60.91 | DOES 558 66.02
ARE 4224 32.57 | SOME 1547 52.4 | OTHER 852 61.02 | FAMILY 555 66.09
ALL 4129 33.07 | MAN 1529 52.59 | NIGHT 831 61.12 | KIND 554 66.16
NOT 4117 33.56 | VERY 1510 52.77 | THEY'RE 829 61.22 | MAY 551 66.22
IT'S 4111 34.05 | OUR 1475 52.94 | HELP 805 61.31 | MOST 548 66.29
KNOW 3962 34.53 | DOWN 1474 53.12 | HAPPEN 802 61.41 | GOD 530 66.35
NO 3890 34.99 | THING 1456 53.3 | WHAT'S 800 61.5 | WOMAN 524 66.41
BUT 3885 35.46 | WAY 1431 53.47 | THOSE 784 61.6 | MANY 512 66.48
DON'T 3859 35.92 | YEAR 1420 53.64 | THAN 782 61.69 | HI 510 66.54
GET 3739 36.37 | PEOPLE 1409 53.81 | FIND 776 61.78 | NOTHING 509 66.6
THEY 3612 36.8 | COULD 1408 53.97 | LAST 760 61.88 | NEXT 508 66.66
LIKE 3436 37.21 | MORE 1383 54.14 | WORLD 760 61.97 | MOVE 503 66.72
SO 3425 37.62 | US 1381 54.31 | AFTER 756 62.06 | ANOTHER 499 66.78
JUST 3300 38.02 | I'LL 1369 54.47 | SHE'S 743 62.15 | CAME 498 66.84
AT 3295 38.41 | YES 1364 54.63 | MR 741 62.24 | TONIGHT 495 66.9
HERE 3197 38.8 | HE'S 1359 54.8 | EVEN 740 62.32 | LEFT 493 66.96
OUT 3117 39.17 | THANK 1352 54.96 | HOME 735 62.41 | TURN 484 67.02
UP 3074 39.54 | LITTLE 1351 55.12 | AGAIN 727 62.5 | DOESN'T 483 67.07
ABOUT 3031 39.9 | LOVE 1340 55.28 | MADE 719 62.59 | I'D 482 67.13
ONE 2998 40.26 | WHY 1278 55.43 | BIG 718 62.67 | NEITHER 481 67.19
RIGHT 2906 40.61 | REALLY 1263 55.58 | DOING 718 62.76 | MUST 476 67.25
COME 2904 40.95 | TELL 1256 55.73 | PLEASE 712 62.84 | KILL 472 67.3
THERE 2886 41.3 | OVER 1249 55.88 | PUT 711 62.93 | HAND 470 67.36
OH 2781 41.63 | CALL 1241 56.03 | LOT 709 63.01 | STAY 468 67.41
CAN 2772 41.97 | CAN'T 1192 56.18 | SHOULD 700 63.1 | WATCH 467 67.47
IF 2751 42.3 | WHERE 1179 56.32 | BEFORE 694 63.18 | YOU'VE 467 67.53
WANT 2730 42.62 | SAID 1169 56.46 | AROUND 688 63.26 | CHILDREN 465 67.58
AS 2714 42.95 | DAY 1163 56.6 | WAIT 688 63.34 | HEAR 463 67.64
NOW 2696 43.27 | NEVER 1158 56.74 | STILL 687 63.43 | HOPE 462 67.69
SHE 2686 43.59 | SOMETHING 1158 56.87 | START 684 63.51 | MOTHER 455 67.75
THINK 2606 43.9 | WE'RE 1155 57.01 | LIVE 680 63.59 | NICE 455 67.8
HER 2591 44.22 | THEN 1140 57.15 | USE 675 63.67 | REMEMBER 454 67.86
GO 2584 44.52 | TWO 1133 57.28 | SURE 674 63.75 | OWN 453 67.91
WILL 2522 44.83 | BECAUSE 1115 57.42 | KEEP 671 63.83 | WON'T 451 67.96
WELL 2442 45.12 | THEIR 1089 57.55 | SIR 670 63.91 | MORNING 449 68.02
GOING 2428 45.41 | HEY 1087 57.68 | OLD 667 63.99 | EVERYTHING 446 68.07
HIS 2409 45.7 | FIRST 1065 57.81 | MAYBE 657 64.07 | 
GOT 2375 45.98 | NEED 1049 57.93 | WE'LL 653 64.15 | 




Figure 6 

Cumulative Frequency Percentage 
for "Wings” and "All Programs" 



[Graph not reproduced. Vertical axis: Cumulative Percent.] 





FOUNDED 1847 

A Professional Journal Dedicated to Quality in Education and in Related Services 
for Children and Adults Who are Deaf and Hard of Hearing 

Complete Contents 273 

Deaf and Hearing Parents' Interactions with Eldest Hearing Children 278 

Closed-Captioned Television Presentation Speed and Vocabulary 284 



Whole Language and Deaf Bilingual-Bicultural Education — Naturally! 293 



Leadership Personnel Needs in the Education of Deaf and Hard of Hearing Children: 
Results of Two National Surveys 299 

Attitudes of Teachers and Parents in India Toward Career Choices for Deaf and Hearing People 303 

Essential Practices as Adults Read to Meet the Needs of Deaf or Hard of Hearing Students 309 





Volume 141, No. 4 

Closed-Captioned Television 
Presentation Speed and Vocabulary 




This study summarizes an extensive research project on closed-captioned television. 
Caption data were recorded from 205 television programs. Both roll-up and pop-on 
captions were analyzed. In the first part of the study, captions were edited to remove 
commercials and then processed by computer to get caption speed data. Caption 
rates among program types varied considerably. The average caption speed for all 
programs was 141 words per minute, with program extremes of 74 and 231 words 
per minute. The second part of the study determined the amount of editing being 
done to program scripts. Ten-minute segments from two different shows in each of 
13 program categories were analyzed by comparing the caption script to the program 
audio. The percentage of script edited out ranged from 0% (in instances of verbatim 
captioning) to 19%. In the third part of the study, commonly used words in 
captioning and their frequency of appearance were analyzed. All words from all the 
programs in the study were combined into one large computer file. This file, which 
contained 834,726 words, was sorted and found to contain 16,102 unique words. 



Carl Jensema 
Ralph McCann 
Scott Ramsey 



Jensema is vice president of the Institute for Disabilities Research and Training, Inc., 
Silver Spring, Maryland. McCann and Ramsey are research assistants with the Institute. 




In 1972, public television station WGBH in Boston did a unique experiment in which 
The French Chef, a cooking program featuring Julia Child, was open-captioned. The success 
of this first attempt at captioning led WGBH to rebroadcast daily an open-captioned version 
of ABC World News Tonight for deaf and hard of hearing people. During the 1970s, this was 
the only regularly broadcast television program in America designed to be accessible to deaf 
people. It was wildly popular in the deaf community because it was the only televised news 
program deaf people could understand. 

When WGBH began rebroadcasting ABC World News Tonight, there were no rules for 
captioning. Captioning policy developed on a day-to-day basis as captioning problems arose. 
The guiding principle at that time was to make the program accessible to every deaf viewer 
regardless of reading ability. Because studies conducted by the Gallaudet University Office 
of Demographic Studies (Jensema, Schildroth, and O'Rourke, 1975; Trybus and Karchmer, 1977; 
Jensema and Trybus, 1978) indicated that the average graduate from an educational program 
for deaf and hard of hearing students read at about a third-grade level, WGBH extensively 
edited the program script. The word count was cut by about a third and the reading level was 
cut from roughly the sixth-grade level to the third-grade level. All passive-voice sentence 
construction was removed, nearly all idioms were removed, contractions were eliminated, 
clauses were converted into short declarative sentences, and even jokes and puns were changed 
if it was felt the deaf and hard of hearing audience would not understand them. These 






Table 1 

Programs Selected for Study by Type and Number 



Program type | Number | Percent
Children's animation | 20 | 11
Children's educational | 11 | 6
Children's action | 6 | 3
Prime-time dramas | 26 | 14
Situation comedies | 26 | 14
Films | 21 | 11
News | 20 | 11
Documentaries | 17 | 9
Talk shows | 10 | 5
Soap operas | 9 | 5
Music specials | 6 | 3
Sports | 6 | 3
Live performances | 5 | 3
Total Programs: | 183 | *98

Music videos
2- to 5-minute songs | 22 | 

Total number of programs: | 205 | 

*Percentage sums to less than 100 because of rounding. 



captioning techniques, which almost everyone now considers overediting, continued for many 
years. Part of the reason for this was that deaf people were so delighted to have captions 
that they accepted almost anything thrown on the screen. 

As captioned television became a standard part of television services in the late 1980s, 
deaf people began to examine the quality of captioning more closely. Deaf viewers wrote 
letters to caption companies indicating they wanted access to whatever was spoken on the 
audio and that captioners should not play the role of censors. According to conversations 
with captioning company officials, caption companies have tended to interpret this as 
meaning deaf people want straight verbatim captioning. 

Counting both broadcast and cable, about 100 hours of captioned television programs are 
shown on national television in the United States each day, yet heretofore no formal data 
on the characteristics of the captions on these programs have been collected. Are programs 
now captioned verbatim? How much editing is done? What is the caption presentation speed of 
programs currently being shown on television? How does this presentation speed vary with the 
type of program? These and other questions are addressed in the research study reported here. 

Method 

Recording 

Caption data for the present study were obtained from a sample of television programs 
recorded as they were telecast. A ten-member advisory panel met to select and analyze 
programs to be studied. This panel consisted of: Dr. Robert Davilla, New York School for 
the Deaf; Dr. Judy Johnson, Gallaudet University; Ellie Korres, Gallaudet University; 
Mardi Loetermen, WGBH; Beth Nubbe, NCI; Judith Brentano, The Caption Company; Martin Block, 
VITAC; Brenda Battat, SHHH; Dr. Linda Gambrell, University of Maryland; and JoAnn McCann, 
U.S. Department of Education. Jeff Hutchins of VITAC was the technical consultant for the 
project. Based on the recommendations of these captioning experts, a sample of 183 programs 
stratified by program type was selected and recorded in late 1994. Table 1 provides a 
breakdown of the program types and the number of programs selected for each type. The 
programs varied in length from a half-hour to about four hours, with the film Gettysburg 
being the longest. The programs represented a total of approximately 180 hours of airtime. 
Recording was done using the cable television service in a number of different homes. The 
exception was for some movies shown over premium cable channels. It proved easier to rent 
the films from a local video store than to record them from the cable system. All recording 
was done on a consumer-quality four-head videocassette recorder (VCR). In addition, the 
project staff gained access to 22 captioned music videos, each of which was between two and 
five minutes long. These were analyzed separately because they were so different from the 
regular programming. 

Data Extraction 

The videotapes were replayed with the signal being run through a special closed-caption 
decoder which read the closed-caption information from line 21 of the vertical blanking 
interval and fed that data into a computer file. Special software was written to read the 
computer's clock and attach a start time and an end time to each line of caption data. This 
time-and-caption file was the basic raw data analyzed for each program. 






American Annals of the Deaf




CCTV Presentation Speed and Vocabulary 



Programs recorded from the com- 
mercial networks and pay channels 
had advertisements, and even those on 
the PBS network were occasionally in- 
terrupted by station breaks or promo- 
tional material. All of this nonprogram 
material was edited out of each data 
file. This was done by importing each 
data file into a spreadsheet and delet- 
ing the nonprogram parts, a lengthy 
and time-consuming process. The result
was a final “clean” data file for
each program. 

Time Analysis 

Analysis of the time data was much 
more complex than it might seem. 
Captions and the control codes associ- 
ated with them are transmitted in a 
steady, binarily coded stream in the 
television signal, but the actual appear- 
ance of captions on the screen is not 
necessarily exclusively sequential. 
There is much time overlap in the cap- 
tion lines. 

There are two kinds of captions, 
each with different characteristics. Roll- 
up captions scroll up the screen, usually 
in a three-line format. As one line rolls 
off the top, a new line rolls up from the 
bottom. Although three lines are usually 
used, two-line and four-line captions 
also are possible. The roll usually has a 
steady speed, but the captioner can in- 
crease or decrease it as needed to keep 
up with the program audio. Pop-on 
captions are blocks of words consisting 
of one to four lines. These captions pop 
onto the screen and pop off after a few 
seconds. There may be more than one 
block of pop-on captions on the screen 
at one time. For both kinds of captions, 
the words are transmitted as one long 
stream of data, but control codes in the 
data stream make the decoder divide 
the words into caption lines, which 
sometimes have an overlap in screen 
display time. 

The “clean” data files in this study
were analyzed with a custom computer 
software program. Ten kinds of infor- 
mation were outputted by the com- 
puter program. The two most 
important were total time of program
(the actual time from when a program
begins to when it ends, including 
break time and commercial time; it 
does not include commercials or break 
time before and after a program) and 
total time of captions on screen (the 
time during which program captions 
are present on the screen; it does not 
include break time, commercial time, 
or program time during which no cap- 
tions are shown). All of the analysis in 
this study is based on total time of cap- 
tions on screen. Other kinds of infor- 
mation outputted by the computer 
program were total number of caption 
lines, total number of words, total 
number of characters, mean number of
caption lines per minute, mean number
of words per line, mean number of
characters per line, mean number of 
words per minute, and mean number 
of characters per minute. 
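
The measures listed above can be illustrated with a minimal sketch. The study does not publish its software, so everything here is an assumption for illustration: the names `CaptionLine`, `on_screen_seconds`, and `caption_stats` are hypothetical, times are taken to be in seconds, and characters are counted with spaces included. Because caption lines can overlap in display time, the sketch merges overlapping intervals before computing total time of captions on screen.

```python
from dataclasses import dataclass


@dataclass
class CaptionLine:
    start: float  # display start time in seconds (assumed unit)
    end: float    # display end time in seconds
    text: str     # caption text for this line


def on_screen_seconds(lines):
    """Time during which at least one caption line is displayed.

    Overlapping lines are merged so shared display time is counted once.
    """
    total = 0.0
    cur_start = cur_end = None
    for start, end in sorted((l.start, l.end) for l in lines):
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged interval, open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def caption_stats(lines):
    """Return the count and rate measures described in the text."""
    if not lines:
        raise ValueError("no caption lines")
    minutes = on_screen_seconds(lines) / 60.0
    if minutes <= 0:
        raise ValueError("captions have no on-screen time")
    n_lines = len(lines)
    n_words = sum(len(l.text.split()) for l in lines)
    n_chars = sum(len(l.text) for l in lines)  # assumption: spaces count
    return {
        "lines": n_lines,
        "words": n_words,
        "chars": n_chars,
        "lines_per_min": n_lines / minutes,
        "words_per_line": n_words / n_lines,
        "chars_per_line": n_chars / n_lines,
        "wpm": n_words / minutes,
        "cpm": n_chars / minutes,
    }
```

Dividing by on-screen caption time rather than total program time mirrors the statement that all analysis is based on total time of captions on screen.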

Editing Level 

People who are deaf and hard of hear- 
ing have repeatedly indicated through 
letters to caption companies that they 
prefer verbatim captioning. They know 
they are not always getting perfect ver- 
batim captioning because they some- 
times see an actor speak a word or 
group of words for which there is no 
caption on the screen. The problem is 
that no one seems to know how much 
editing is done and how much is lost 
in the conversion from audio to 
captioning. In the present study, 26 
programs (2 for each of 13 program 
types) were randomly selected, and for 
each program a sample of 10 minutes 
of audio was compared to the words 
that had been captioned. The results
were tabulated to give an indication of 
the percentage of audio usually cap- 
tioned for each program. 
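
A comparison of this kind could be approximated in software as sketched below. This is not the procedure used in the study, which is described only as comparing the audio to the captioned words; the function names `tokenize` and `percent_captioned` are hypothetical, and the sketch assumes a transcript of the spoken audio is available as text. It aligns the two word sequences with Python's standard `difflib` module and reports the share of spoken words that also appear, in order, in the captions.

```python
import difflib
import re


def tokenize(text):
    """Split text into upper-case word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9']+", text.upper())


def percent_captioned(spoken, caption):
    """Percentage of spoken words that survive, in order, in the caption."""
    spoken_words = tokenize(spoken)
    caption_words = tokenize(caption)
    if not spoken_words:
        raise ValueError("spoken text contains no words")
    matcher = difflib.SequenceMatcher(
        a=spoken_words, b=caption_words, autojunk=False
    )
    # Sum the lengths of all aligned runs of identical words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(spoken_words)
```

For instance, the spoken line "YOU KNOW YOU DON'T HAVE TO GO SHOPPING." captioned as "YOU DON'T HAVE TO GO SHOPPING." keeps six of eight spoken words, so the function returns 75.0.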

Word Analysis 

What words are used in captioning? 
What is the frequency with which 
words appear in captions? To provide 
some insight into these questions, all 
the words in all the programs in the 
present study were combined into one 
large computer file. This file, which
contained 834,726 words, was sorted
and the 16,102 individual, unique
words were arranged into a frequency 
distribution. 

Results and Discussion 

Program Characteristics 
A total of 205 programs were analyzed: 
183 regular programs and 22 short 
(two- to five-minute) music videos. 
Among the 183 regular programs, 78 
ran a half-hour, 75 ran one hour to 90
minutes, 25 ran two hours, and 5 ran 
more than two hours. Overall, there 
were roughly 180 hours of video. 

Caption Speed 

In Table 2, data on caption speed are 
provided by category for the 183 pro- 
grams analyzed for the present article. 
(We will discuss the 22 short music vid- 
eos separately.) For each program 
grouping, the mean, standard devia- 
tion, maximum value, minimum value, 
and range are given for words per 
minute (WPM), characters per minute 
(CPM), characters per word, caption 
lines per minute, words per line, and 
characters per line. For all programs, 
the mean values were 141 WPM, 736
CPM, 5.2 characters per word, 38.7
caption lines per minute, 3.7 words per
line, and 19.2 characters per line. WPM
and CPM are the two indexes usually 
used to measure caption speed. WPM 
has more intuitive meaning for most 
people, even though it can be affected 
by differences in word length. 

In the present study, we found that 
roll-up captions generally present 
more words over a given period than 
pop-up captions (151 WPM vs. 138 
WPM), and that roll-up captions are 
used for a wider range of audio 
speeds, from very slow (74 WPM) to 
very fast (231 WPM). 

Sports programs and music specials 
had the slowest caption speeds. Sports 
are visual in nature, and most viewers 
take more interest in screen action than 
in commentary. Music specials follow 
the pace of the music, and the lyrics 
often are sung more slowly than they 



Volume 141, No. 4 






Table 2

Caption Speed Statistics

(WPM = words per minute; CPM = characters per minute; Char = characters; M = mean; SD = standard deviation)

Program type (N)                    | Stat.   | WPM |   CPM | Char/word | Lines/min | Words/line | Char/line
All programs (N = 183)              | M       | 141 |   736 |       5.2 |      38.7 |        3.7 |      19.2
                                    | SD      |  21 |   108 |       0.2 |       6.0 |        0.5 |       2.7
                                    | Maximum | 231 | 1,171 |       6.2 |      55.3 |        5.0 |      25.9
                                    | Minimum |  74 |   357 |       4.7 |      19.1 |        2.8 |      14.0
                                    | Range   | 157 |   814 |       1.5 |      36.2 |        2.2 |      11.9
Roll-up captions (N = 48)           | M       | 151 |   781 |       5.2 |      34.8 |        4.4 |      22.5
                                    | SD      |  31 |   165 |       0.2 |       7.2 |        0.3 |       1.8
                                    | Maximum | 231 | 1,171 |       5.6 |      55.3 |        5.0 |      25.9
                                    | Minimum |  74 |   357 |       4.8 |      19.1 |        3.4 |      16.3
                                    | Range   | 157 |   814 |       0.8 |      36.2 |        1.6 |       9.6
Pop-up captions (N = 135)           | M       | 138 |   719 |       5.2 |      40.0 |        3.5 |      18.1
                                    | SD      |  15 |    73 |       0.2 |       4.9 |        0.3 |       2.0
                                    | Maximum | 177 |   832 |       6.2 |      49.6 |        4.4 |      22.9
                                    | Minimum |  87 |   463 |       4.7 |      24.4 |        2.8 |      14.0
                                    | Range   |  89 |   369 |       1.5 |      25.2 |        1.6 |       8.9
Talk shows (N = 10)                 | M       | 177 |   897 |       5.1 |      40.4 |        4.4 |      22.2
                                    | SD      |  30 |   151 |       0.1 |       6.4 |        0.3 |       1.3
                                    | Maximum | 231 | 1,171 |       5.3 |      55.3 |        5.0 |      24.6
                                    | Minimum | 142 |   713 |       4.9 |      33.2 |        4.1 |      20.7
                                    | Range   |  89 |   458 |       0.4 |      22.1 |        0.9 |       4.0
Sports (N = 6)                      | M       | 106 |   535 |       5.1 |      23.2 |        4.6 |      23.0
                                    | SD      |  15 |    79 |       0.1 |       3.0 |        0.2 |       1.2
                                    | Maximum | 126 |   645 |       5.2 |      26.3 |        4.9 |      25.0
                                    | Minimum |  88 |   442 |       4.9 |      19.1 |        4.1 |      21.4
                                    | Range   |  38 |   203 |       0.3 |       7.2 |        0.7 |       3.6
Soap operas (N = 9)                 | M       | 154 |   778 |       5.1 |      36.7 |        4.2 |      21.2
                                    | SD      |  15 |    72 |       0.1 |       3.3 |        0.3 |       1.2
                                    | Maximum | 178 |   896 |       5.2 |      44.1 |        5.0 |      24.3
                                    | Minimum | 138 |   696 |       4.9 |      33.1 |        4.0 |      20.3
                                    | Range   |  40 |   200 |       0.3 |      11.0 |        1.0 |       4.0
Situation comedies (N = 26)         | M       | 147 |   758 |       5.2 |      43.1 |        3.4 |      17.7
                                    | SD      |  10 |    51 |       0.1 |       3.8 |        0.3 |       1.3
                                    | Maximum | 162 |   825 |       5.4 |      49.6 |        4.0 |      20.3
                                    | Minimum | 119 |   593 |       5.0 |      35.3 |        3.0 |      15.5
                                    | Range   |  43 |   232 |       0.4 |      14.3 |        1.1 |       4.8
Prime-time dramas (N = 26)          | M       | 146 |   748 |       5.1 |      42.9 |        3.4 |      17.5
                                    | SD      |  10 |    52 |       0.1 |       3.5 |        0.2 |       1.1
                                    | Maximum | 164 |   814 |       5.4 |      48.5 |        3.9 |      19.6
                                    | Minimum | 120 |   605 |       4.9 |      35.6 |        3.2 |      16.0
                                    | Range   |  45 |   210 |       0.5 |      12.9 |        0.7 |       3.5
News (N = 20)                       | M       | 157 |   835 |       5.3 |      36.2 |        4.3 |      23.1
                                    | SD      |  15 |    86 |       0.2 |       4.1 |        0.3 |       1.5
                                    | Maximum | 183 |   978 |       5.7 |      43.2 |        4.9 |      25.9
                                    | Minimum | 123 |   652 |       4.9 |      28.7 |        3.9 |      20.7
                                    | Range   |  60 |   326 |       0.7 |      14.5 |        1.0 |       5.2
Music specials (N = 6)              | M       | 107 |   551 |       5.2 |      29.0 |        3.7 |      19.2
                                    | SD      |  24 |   135 |       0.2 |       8.1 |        0.5 |       2.6
                                    | Maximum | 144 |   729 |       5.4 |      41.6 |        4.5 |      22.4
                                    | Minimum |  74 |   357 |       4.8 |      19.2 |        3.2 |      16.3
                                    | Range   |  70 |   372 |       0.6 |      22.4 |        1.3 |       6.1
Live performances, no music (N = 5) | M       | 137 |   725 |       5.3 |      36.5 |        3.7 |      19.8
                                    | SD      |  19 |    88 |       0.1 |       2.6 |        0.4 |       1.9
                                    | Maximum | 156 |   808 |       5.4 |      39.3 |        4.4 |      22.5
                                    | Minimum | 115 |   623 |       5.2 |      34.4 |        3.3 |      17.8
                                    | Range   |  41 |   185 |       0.3 |       4.9 |        1.1 |       4.7
Children's educational (N = 10)     | M       | 124 |   667 |       5.4 |      34.6 |        3.5 |      18.7
                                    | SD      |  18 |    99 |       0.2 |       4.9 |        0.3 |       1.7
                                    | Maximum | 154 |   791 |       5.7 |      38.8 |        4.1 |      21.7
                                    | Minimum |  87 |   463 |       5.0 |      24.4 |        3.1 |      16.8
                                    | Range   |  66 |   328 |       0.7 |      14.4 |        1.0 |       4.9
Children's animation (N = 20)       | M       | 125 |   660 |       5.3 |      39.4 |        3.2 |      16.8
                                    | SD      |  13 |    61 |       0.2 |       3.9 |        0.2 |       1.0
                                    | Maximum | 148 |   784 |       5.7 |      46.3 |        3.5 |      19.0
                                    | Minimum | 105 |   574 |       4.9 |      33.4 |        2.9 |      15.2
                                    | Range   |  43 |   210 |       0.8 |      12.9 |        0.6 |       3.9
Children's action (N = 6)           | M       | 131 |   685 |       5.2 |      40.2 |        3.3 |      17.0
                                    | SD      |  20 |   101 |       0.1 |       5.0 |        0.2 |       1.4
                                    | Maximum | 152 |   788 |       5.5 |      45.7 |        3.5 |      19.1
                                    | Minimum |  95 |   494 |       5.1 |      33.2 |        2.9 |      14.9
                                    | Range   |  57 |   294 |       0.4 |      12.6 |        0.6 |       4.2
Film (N = 22)                       | M       | 140 |   710 |       5.1 |      41.3 |        3.4 |      17.3
                                    | SD      |  13 |    59 |       0.2 |       3.9 |        0.4 |       1.9
                                    | Maximum | 177 |   832 |       5.4 |      47.9 |        4.2 |      20.5
                                    | Minimum | 121 |   607 |       4.7 |      32.1 |        2.8 |      14.0
                                    | Range   |  56 |   225 |       0.7 |      15.8 |        1.4 |       6.4
Documentary (N = 17)                | M       | 139 |   766 |       5.5 |      35.7 |        3.9 |      21.6
                                    | SD      |  12 |    43 |       0.2 |       3.4 |        0.4 |       1.7
                                    | Maximum | 161 |   829 |       6.2 |      45.6 |        4.9 |      25.4
                                    | Minimum | 113 |   698 |       5.2 |      31.0 |        3.3 |      18.1
                                    | Range   |  48 |   131 |       1.0 |      14.6 |        1.6 |       7.3






Table 3

Speed Rates for Programs with Fastest and Slowest Captioning

Program                | Program type   | Caption type   | WPM |   CPM | Lines/min | Words/line | Char/line | Char/word

Programs with fastest captioning
Later with Greg Kinnear | Talk show      | roll-up 3-line | 231 | 1,171 |        55 |        4.2 |        21 |       5.1
Last Call               | Talk show      | roll-up 3-line | 229 | 1,134 |        46 |        5.0 |        25 |       5.0
Connie Chung            | News           | roll-up 3-line | 183 |   920 |        38 |        4.8 |        24 |       5.0
Guiding Light           | Soap opera     | roll-up 3-line | 178 |   870 |        36 |        5.0 |        24 |       4.9
Meet the Press          | Talk show      | roll-up 3-line | 177 |   930 |        40 |        4.4 |        23 |       5.3
M                       |                |                | 199 | 1,005 |        43 |        4.7 |        24 |       5.0

Programs with slowest captioning
ABC Sports: Golf        | Sports         | roll-up 2-line |  94 |   463 |        20 |        4.7 |        23 |       4.9
TNT Basketball          | Sports         | roll-up 3-line |  88 |   442 |        19 |        4.6 |        23 |       5.0
Sesame Street           | Children's Ed. | pop-on         |  87 |   463 |        27 |        3.2 |        17 |       5.3
Billboard Music Awards  | Music special  | roll-up 3-line |  87 |   430 |        19 |        4.5 |        22 |       5.0
Whitney Houston         | Music special  | roll-up 3-line |  74 |   357 |        22 |        3.4 |        16 |       4.8
M                       |                |                |  86 |   431 |        22 |        4.1 |        20 |       5.0



would be spoken. The result is a 
slower caption rate. There are excep- 
tions, however, as we later show in the 
discussion of music videos. 

Although we found children’s pro- 
gramming to have a slow captioning 
rate, that rate was faster than expected. 
For children’s educational, animation, 
and action programs, the rates were 
124, 125, and 131 WPM, respectively. 
The overall mean for children’s pro- 
grams was 126 WPM. Program speed 
ranged from 87 WPM for Sesame Street 
to 154 WPM for Bill Nye the Science 
Guy. A trend toward faster caption 
rates for programs aimed at older chil- 
dren can be discerned; this initial find- 
ing, however, warrants more research. 

In the middle range of caption 
speed were performances (137 WPM), 
documentaries (139 WPM), films (140 
WPM), prime-time dramas (146 WPM), 
and situation comedies (147 WPM). 
These kinds of programs tended to
cluster around the mean captioning
speed of 141 WPM which was found 
for all 183 programs analyzed. 

Soap operas (154 WPM), news pro- 
grams (157 WPM), and talk shows (177 
WPM) had the fastest caption speeds. 
The mean speed for talk shows was 
boosted by two late-night programs, 
Later With Greg Kinnear (231 WPM) 
and Last Call (229 WPM). Table 3 pro- 
vides statistics on the programs with 
the five fastest and five slowest caption 
speeds. The five programs with the 
fastest speeds had a mean caption rate 
more than twice that of the five pro- 
grams with the slowest speeds. 

We had suspected that programs 
considered more difficult to read might 
have a longer mean word length. This 
was not the case. For example, al- 
though the captioning for Sesame Street 
was easier to read than for Meet the 
Press, the captions for both programs
have a mean word length of 5.3 characters.
More difficult material is not
necessarily characterized by longer 
word length, and we cannot take word 
length as an indication of reading dif- 
ficulty. 

The music videos were analyzed as 
a separate category. Music videos were 
included in this study mostly as a mat- 
ter of curiosity because they represent 
a unique kind of caption material. The 
caption speed for the 22 music videos 
varied from 60 to 311 WPM, a much 
wider range than was found in the
regular program categories. In many 
music videos, images flash on the 
screen for a brief time. This makes cap- 
tions harder to read because the 
viewer’s attention is distracted. Rap 
music videos had the fastest and most 
difficult-to-read captions. For example, 
the captions for the song Freak It (311 
WPM) proved impossible to under- 
stand without repeated viewing. 

Caption Editing 

For each of the program categories, 
two programs were randomly selected, 
and a 10-minute segment of each was 
analyzed to see if there were any 
words spoken but not captioned. The 
results are provided in Table 4. Several 
programs were 100% captioned. The 
most heavily edited program was a golf 
program on the ABC network for 
which only 81% of the spoken words 
were captioned. This program was 
clearly an anomaly because it was cap- 
tioned live and roll-up captions were 
used, meaning that there were many 
times when captions could not be put 
on screen without obscuring a player 
in the act of putting or a ball rolling 
toward a cup. 

Among the 26 programs examined, 
the average was 94% captioned. When 
the golf program was excluded, the 
average was 95% captioned. To take a 
closer look at the material being ed- 
ited, we selected two programs and 
made a word-by-word inspection. A 
situation comedy, Hangin’ with Mr. 
Cooper, was chosen because it was the
most heavily edited program with pop- 
on captions (87% captioned). At 91% 






Table 4

Percentage of Captioned Audio

Program type           | Program title                   | Percentage captioned
Soap opera             | The Bold and the Beautiful      | 100
                       | Guiding Light                   | 100
Documentary            | Wild America                    | 100
                       | Great Railroad Journey          |  99
Film                   | Ace Ventura, Pet Detective      |  98
                       | Madame Butterfly                |  97
Talk show              | Late Show with David Letterman  |  99
                       | Tonight Show with Jay Leno      |  96
Live performance       | Clio Awards                     |  97
                       | Seigfried and Roy               |  95
Prime-time drama       | Arly Hanks                      |  97
                       | ER                              |  94
Music special          | Whitney Houston                 | 100
                       | Billy Ray Cyrus Special         |  91
News                   | ABC News                        |  98
                       | Today                           |  91
Children's action      | Power Rangers                   |  96
                       | California Dreams               |  90
Children's animation   | Animaniacs                      |  97
                       | Batman - The Series             |  89
Children's educational | Kids Songs                      |  93
                       | Barney                          |  88
Situation comedy       | In Living Color                 |  91
                       | Hangin' with Mr. Cooper         |  87
Sports                 | CBS Sports: Figure Skating      |  90
                       | ABC Sports: Golf                |  81





captioned, the Today show was cho- 
sen as an example of a heavily edited 
program with roll-up captions. 

Appendix Table 1 shows the 
changes made in a captioned segment 
of Hangin' with Mr. Cooper. The first
column gives the exact words which 
were spoken. Most of the editing does 
not alter the meaning of the text. The 
changes usually do no more than pro- 
vide a slight simplification of the sen- 
tence structure. Perhaps some of the 
changes were made because the 
captioner’s supervisor gave instructions 
to caption at a certain WPM rate. For
example, replacing “you know you 
don’t have” with “you don’t have” saves 
two words but has little effect on length
or readability. Another possibility is that 
the studio provided the captioner with 
a script and the captioner captioned the 
program verbatim, but then the studio 
decided to go over the program again 
and “sweeten” the audio after it was
captioned. 

Appendix Tables 2, 3, and 4 illustrate 
two different kinds of editing applied to 
the Today show. Parts of this program, 
such as the opening segment and news 
updates, follow a script; other parts, such 
as interviews, do not. For the scripted 
segments, the caption company is given 
a copy of the script before the show airs. 
The company converts the script to cap- 
tions and feeds these captions into the 
broadcast at airtime. The program’s hosts 
and other on-air staff see the script on a 
TelePrompTer, but they do not always 
say exactly the same words that they 
read. The result is "editing" in the form 
of ad-libbing. Appendix Table 2 is an ex- 
cerpt from a scripted segment in which 
several people are interacting. There is 
considerable ad-libbing. Appendix Table 
3 is an excerpt from a scripted segment 
consisting of straight news reporting. The 
newscaster stays with the script, and 
there is very little difference between the
spoken and captioned versions. Appen- 
dix Table 4 is an excerpt from an inter- 
view on Today that was captioned live 
by a stenocaptioner. There is much edit- 
ing, but the essential information is still 
there. 



Word Analysis

The caption scripts from all the pro- 
grams in the present study were com- 
bined into one large computer file. 
This file was edited to remove punc- 
tuation and anything else that was not 
a word. Certain nonstandard utterances 
such as uh, mmm, and ahh were kept, 
since they are commonly used in 
captioning to indicate certain sounds in 
the audio. The resulting word list was 
sorted and arranged into a frequency 
distribution. The file had 834,726
words, of which 16,102 were unique.
Just 10 words (the, you, to, a, I, and,
of, in, it, that) accounted for 176,793 of 
the 834,726 words (21%). Half of all the 
words captioned were accounted for 
by 79 unique words. Just 250 words 
accounted for more than two-thirds of 
all the words used in the captions. The 
graph at Figure 1 depicts the cumula- 
tive frequency of the 4,000 most fre- 
quently occurring unique words. 
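
The cumulative frequency figures reported above follow from a simple counting procedure, sketched here under stated assumptions. The function names `coverage_curve` and `words_needed` are hypothetical, and the input is assumed to be a list of words already stripped of punctuation as described in the text. The sketch ranks unique words by frequency and accumulates their share of all word occurrences.

```python
from collections import Counter


def coverage_curve(words):
    """Cumulative share of all occurrences covered by the top-ranked words.

    Element i is the fraction of the text accounted for by the i + 1 most
    frequent unique words.
    """
    counts = Counter(words)
    total = sum(counts.values())
    running = 0
    curve = []
    for _, n in counts.most_common():
        running += n
        curve.append(running / total)
    return curve


def words_needed(words, share):
    """Smallest number of top-ranked unique words covering `share` of the text."""
    for rank, cumulative in enumerate(coverage_curve(words), start=1):
        if cumulative >= share:
            return rank
    return None  # empty input
```

Applied to the full caption file, a call such as `words_needed(all_words, 0.5)` would correspond to the count of unique words that account for half of all captioned words.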



For comparison, the frequency dis- 
tributions of the words in about a 
dozen individual programs were exam- 
ined. All the cumulative frequency 
graphs for these programs were very 
similar. Figure 2 provides a cumulative 
frequency graph for the 678 unique 
words used in an episode of Wings, a 
situation comedy typical of those cur- 
rently shown on the air. For compari- 
son purposes, the graph also includes 
the cumulative frequency curve for the 
678 most frequently used words among 
all programs. The All Programs line 
provides a lower band for the fre- 
quency curve of any individual pro- 
gram, since it represents all unique 
words available among all programs in 
this study. In this Wings episode, just 51
unique words accounted for half of all 
words used in the captions and 174 
words accounted for 75% of the words 
used. The important point is that cap- 
tioned television (and, by inference, the 







Figure 1 

Cumulative Frequency Percentage for 4,000 Most Frequently Occurring
Unique Words 




Figure 2 

Cumulative Frequency Percentage for Wings and All Programs 




audio that the captions represent) uses 
relatively few unique words. There are 
at least 500,000 words in the English 
language, but mastery of fewer than 
500 words will help a viewer to under- 
stand most of the vocabulary in any 
television program shown in the 
United States today. 



Conclusion 

This research has examined the statis- 
tical characteristics of the closed cap- 
tions in 205 television programs, a 
broad sampling of the material cur- 
rently available over broadcast and 
cable television. The overall mean cap- 
tion speed among all programs was 141 






WPM, but this does not indicate the 
wide variation among television pro- 
grams. The slowest program (a Whitney 
Houston music special) had only 74 
WPM, while a late night talk show 
(Later With Greg Kinnear) had 231
WPM. 

Most captioning shown today ap- 
pears to be near-verbatim. Variance in 
caption speed is mostly a function of 
the audio speed rather than a function 
of captioning techniques or editing. 
For example, the slow captions (74 
WPM) on the Whitney Houston pro- 
gram were compared to the program 
audio and were found to be straight
verbatim captioning. In the cases 
where considerable caption editing 
was found, there were usually good 
reasons. A golf program was found to 
have the most editing (only 81% of the 
audio was captioned), but this editing 
was done because the roll-up captions 
would have obscured on-screen action 
and seriously detracted from the pro- 
gram. When editing was found on pro- 
grams, much of it was attributable to 
program circumstances and techno- 
logical limitations, rather than careless 
captioning or a deliberate editing 
policy. Overall, captions match pro- 
gram audio about 95% of the time. 

Captions, and by extension the spo- 
ken language they represent, use rela- 
tively few unique words, but they use 
them often. Just 250 unique words represented
two-thirds of all 834,726 captioned
words in the programs. The
captions on a typical half-hour program 
use about 700 unique words. It would 
seem that mastery of the use of just a 
relatively small number of words is im- 
portant to understanding captioning. 

References

Jensema, C.J., Schildroth, A.N., & O'Rourke, S.W.
(1975). Score conversion tables and age-based
percentile norms for standard achievement
test, special edition for hearing impaired
students. Office of Demographic Studies.
Washington, DC: Gallaudet College.

Jensema, C.J., & Trybus, R.J. (1978). Communication
patterns and educational achievement
of hearing impaired students. Office of
Demographic Studies. Washington, DC:
Gallaudet College.

Trybus, R.J., & Karchmer, M.A. (1977). School
achievement scores of hearing-impaired children.
National data on achievement status
and growth patterns. American Annals of the
Deaf, 122(2), 62-69.







Appendix Tables 



Appendix Table 1

Changes in a Captioned Segment from Hangin' with Mr. Cooper

Spoken                              | Removed                 | Added      | Caption
TURN IT UP. I CAN'T HEAR ANYTHING.  | I CAN'T HEAR ANYTHING.  |            | TURN IT UP.
SHH! HE'S ON THE PHONE.             |                         |            | SHH! HE'S ON THE PHONE.
COME ON. BABY.                      |                         |            | COME ON, BABY.
YOU KNOW YOU DON'T HAVE             | YOU KNOW                |            | YOU DON'T HAVE
TO GO SHOPPING.                     |                         |            | TO GO SHOPPING.
YOU KNOW WHAT BIG DADDY             |                         |            | YOU KNOW WHAT BIG DADDY
WANT FOR HIS BIRTHDAY.              |                         |            | WANT FOR HIS BIRTHDAY.
HOLD ON                             |                         |            | HOLD ON
LET ME CALL YOU BACK. ALL RIGHT.    |                         |            | LET ME CALL YOU BACK.
WHAT DOES HE WANT?                  |                         |            | WHAT DOES HE WANT?
HEY, BIG DADDY.                     |                         |            | HEY, BIG DADDY.
WE'RE SORRY COUSIN MARK.            | COUSIN MARK             |            | WE'RE SORRY
WE WERE JUST TRYING TO FIND OUT     | WERE JUST TRYING        | WANTED     | WE WANTED TO FIND OUT
WHAT YOU WANTED                     |                         |            | WHAT YOU WANTED
FOR YOUR BIRTHDAY.                  |                         |            | FOR YOUR BIRTHDAY.
WELL YOU KNOW YOU TWO SHOULDN'T     | WELL YOU KNOW...TWO     |            | YOU SHOULDN'T
BE EAVESDROPPING.                   |                         |            | BE EAVESDROPPING,
CAUSE YOU NEVER KNOW                | CAUSE                   |            | YOU NEVER KNOW
WHAT YOU MIGHT HEAR,                |                         |            | WHAT YOU MIGHT HEAR.
LIKE HOW TYLER'S                    |                         |            | LIKE HOW TYLER'S
PARENTS ARE SENDING HIM             | PARENTS ARE SENDING HIM | BEING SENT | BEING SENT
TO MILITARY SCHOOL.                 |                         |            | TO MILITARY SCHOOL.
THE FEW. THE PROUD.                 |                         |            | THE FEW, THE PROUD.
THE BIG-HEADED.                     |                         |            | THE BIG-HEADED.



Appendix Table 2

Changes in Scripted Today Show Segment: People Interacting

Spoken                              | Removed        | Added | Caption
AND WELCOME TO TODAY                |                |       | >>>AND WELCOME TO TODAY
ON THIS THURSDAY MORNING.           |                |       | ON THIS THURSDAY MORNING.
I'M KATIE COURIC,                   |                |       | I'M KATIE COURIC.
AND I'M MATT LAUER. FILLING IN FOR  | FILLING IN FOR |       | >>AND I'M MATT LAUER.
BRYANT GUMBELL WHO IS ON            | GUMBELL WHO    |       | BRYANT IS ON
VACATION THIS WEEK.                 |                |       | VACATION THIS WEEK.
AND MATT AHEAD IN OUR FIRST HALF    | AND MATT       |       | >>AHEAD IN OUR FIRST HALF
HOUR THIS MORNING.                  | THIS MORNING   |       | HOUR,
WE'RE GOING TO GET AN UPDATE        | RE GOING TO    | LL    | WE'LL GET AN UPDATE
ON THE LATEST DEVELOPMENTS          |                |       | ON THE LATEST DEVELOPMENTS
IN THE O.J. SIMPSON CASE            |                |       | IN THE O.J. SIMPSON CASE
AND HEAR WHAT NICOLE BROWN          |                |       | AND HEAR WHAT NICOLE BROWN
SIMPSON'S SISTER HAD TO SAY         |                |       | SIMPSON'S SISTER HAD TO SAY
OUTSIDE THE COURTROOM.              |                |       | OUTSIDE THE COURTROOM.
WE'LL ALSO LOOK                     |                |       | WE'LL ALSO LOOK
AT THE BIZARRE AND VERY TRAGIC      | VERY           |       | AT THE BIZARRE AND TRAGIC
STORY OUT OF SWITZERLAND,           |                |       | STORY OUT OF SWITZERLAND.
WHERE 48 PEOPLE DIED                |                |       | WHERE 48 PEOPLE DIED
IN A MASS SUICIDE,                  |                |       | IN A MASS SUICIDE.
MATT. AND ANOTHER SAD               | MATT           | VERY  | >>AND ANOTHER VERY SAD
STORY THIS MORNING -- KATIE         | KATIE          |       | STORY THIS MORNING
THE PARENTS OF A YOUNG AMERICAN BOY | AMERICAN       |       | THE PARENTS OF A YOUNG BOY
KILLED BY BANDITS IN ITALY          |                |       | KILLED BY BANDITS IN ITALY
A WEEK AGO TODAY,                   |                |       | A WEEK AGO TODAY.








Appendix Table 3

Changes in Scripted Today Show Segment: Straight News Report

Spoken                         | Removed | Added  | Caption
THE GRIM SEARCH                |         |        | >>>THE GRIM SEARCH
CONTINUES THROUGH THE RUINS    |         |        | CONTINUES THROUGH THE RUINS
OF BURNED-OUT HOMES            | HOMES   | HOUSES | OF BURNED-OUT HOUSES
IN SWITZERLAND.                |         |        | IN SWITZERLAND.
IT'S THE AFTERMATH             |         |        | IT'S THE AFTERMATH
OF AN APPARENT MASS SUICIDE    |         |        | OF AN APPARENT MASS SUICIDE
BY MEMBERS OF A DOOMSDAY       |         |        | BY MEMBERS OF A DOOMSDAY
CULT THAT HAS LEFT AT LEAST    |         |        | CULT THAT HAS LEFT AT LEAST
50 PEOPLE DEAD                 |         |        | 50 PEOPLE DEAD
IN SWITZERLAND AND IN CANADA.  | IN      |        | IN SWITZERLAND AND CANADA.
DETAILS NOW FROM NBC'S         | NOW     |        | DETAILS FROM NBC'S
KEITH MILLER.                  |         |        | KEITH MILLER.
THE POLICE SAY THE DEATH       |         |        | >>THE POLICE SAY THE DEATH
TOLL COULD GO HIGHER.          |         |        | TOLL COULD GO HIGHER.
INVESTIGATORS WAITED UNTIL     |         |        | INVESTIGATORS WAITED UNTIL
THIS MORNING TO SEARCH         |         |        | THIS MORNING TO SEARCH
A BURNT-OUT SKI CHALET         | T       | ED     | A BURNED-OUT SKI CHALET
FEARING IT COULD BE            |         |        | FEARING IT COULD BE
BOOBY-TRAPPED.                 |         |        | BOOBY-TRAPPED.



Appendix Table 4

Changes in Unscripted Today Show Segment: Captioned Live by Stenographer

Spoken                                     | Removed              | Added | Caption
WHAT HAPPENED?                             |                      |       | WHAT HAPPENED?
>>WELL UH. INDIVIDUAL INVESTORS            | WELL, UH             |       | >>INDIVIDUAL INVESTORS
ACTUALLY HUNG IN THERE.                    |                      |       | ACTUALLY HUNG IN THERE.
THE MARKET WAS DOWN THE                    |                      |       | THE MARKET WAS DOWN THE
WORST WE'VE HAD ALL YEAR.                  |                      |       | WORST WE'VE HAD ALL YEAR.
MOSTLY BECAUSE TECHNOLOGY STOCKS           | MOSTLY               |       | BECAUSE TECHNOLOGY STOCKS
TOOK A REAL HIT.                           |                      |       | TOOK A REAL HIT.
>>AND DO YOU NOW RECOMMEND THAT. UH. SMALL | AND. NOW, THAT. UH.  |       | DO YOU RECOMMEND SMALL
INVESTORS GET BACK INTO                    |                      |       | INVESTORS GET BACK INTO
TECHNOLOGY STOCKS?                         |                      |       | TECHNOLOGY STOCKS?
>>YES                                      | >>YES                |       |
>>AS A LOT OF PEOPLE ARE DOING             | AS                   |       | A LOT OF PEOPLE ARE DOING
RIGHT NOW.                                 | RIGHT NOW            | THIS? | THIS?
>>THEY ARE. THEY ARE.                      | THEY ARE.            |       | >>THEY ARE.
THEY HAVEN'T HAD MUCH CHANCE TO            | MUCH                 |       | THEY HAVEN'T HAD A CHANCE TO
GET INTO THESE THINGS AT LOWER             | THINGS               |       | GET INTO THESE AT LOWER
PRICES.                                    |                      |       | PRICES.
BUT THEY'VE DONE THAT AND ALREADY THEY'VE  | BUT, ALREADY THEY'VE | HAVE  | THEY'VE DONE THAT AND HAVE
COME BACK QUITE STRONGLY. SO I THINK       | QUITE, SO I THINK    |       | COME BACK STRONGLY.
IT IS TIME TO GET BACK INTO                | IS                   | 'S    | IT'S TIME TO GET BACK INTO
TECHNOLOGY.                                |                      |       | TECHNOLOGY.







Viewer Reaction to Different 
Captioned Television Speeds 



Carl Jensema, Ph.D. 



Institute for Disabilities Research and Training, Inc. 
2424 University Boulevard West 
Silver Spring, MD 20902 

Phone: 301-942-4326 v/tty 
301-942-4439 fax 

Internet: http://www.IDRT.com



June 1997 



Abstract 



A series of 24 short, 30-second video segments captioned at different speeds were shown to 
578 people. The subjects used a five-point scale (Too Fast, Fast, OK, Slow, Too Slow) to make an 
assessment of each segment’s caption speed. The “OK” speed, defined as the speed at which 
“Caption speed is comfortable to me,” was found to be about 145 Words Per Minute (WPM). Most 
subjects did not seem to have significant trouble with the captions until the rate was at least 170 WPM.



People who could hear wanted slightly slower captions. However, this seemed to relate to 
how often people watched captioned television. Frequent viewers were comfortable with somewhat 
faster captions. Age and sex were not related to the caption speeds people were comfortable with. 
Education had no relation to caption speed except that people who had attended graduate school 
might prefer slightly faster captions. 







Introduction 



Since it first appeared on television broadcasts on March 16, 1980, closed captioned television
has become an important factor in the education and entertainment of people who are deaf or hard of
hearing. There are over 500 hours of closed captioned television programming shown each week
and the number of hours is steadily increasing. By the turn of the century, most programs shown on
television are expected to be closed captioned. 

This outpouring of televised material for people who are deaf or hard of hearing has raised 
many questions concerning how well the captions fit their intended audience. One of the major 
issues is caption speed. When closed captions were first shown, they were usually edited down to
120 words per minute (WPM) or less. Since then, most caption companies have adopted a policy of
captioning every word spoken. This change was made partly in response to viewer comments and
partly due to the cost of editing. Unfortunately, relatively little is known of the relationship between
caption speed and the reading skills and preferences of the viewers. The author of this article has 
been working for several years to investigate this relationship. 

This is the second in a series of research studies related to the speed with which captions are 
presented on television programs. The first study (Jensema, McCann, and Ramsey, 1996) examined 
over 200 closed captioned television programs and calculated the caption presentation speed of each.
The mean caption speed among all programs was 141 WPM, with considerable variation for
different types of programs.

The second study, the results of which are presented here, measured how comfortable people 
were with different caption speeds. This was done by showing them a series of captioned video 
segments and asking them how they liked the caption speed. 



Procedure 



Experimental Materials 

The materials in this project were a series of 24 short, 30-second video segments, each 
captioned at a specific speed. Subjects watched each segment and made an assessment of the 
segment's caption speed. The video segments were developed specifically for this project. 

Three topics were selected for the video tape materials: Sailing, Space, and the Nation's
Capital. Posters were obtained for each topic, with care being taken to select posters which were 
relevant to the topic, but did not give information related to the captions. A 30-second video was 
shot of each poster, with the camera being moved around the poster to give the illusion of a moving 
picture. The idea was to create interesting video images related to the topic to distract the viewer 
without duplicating information given in the captions. For example, if the captions talked about the 
White House, an image of some other Washington building would be shown. 






Each topic was introduced with a simple name given on a blank screen and had eight 30-second
video segments. Segments were separated by ten seconds of blank screen on which a
printed message was shown telling the subjects to mark their papers. To control for audio
information, the tapes were completely silent.

A caption script was developed for each of the three topics. These scripts were divided into
eight parts, one for each of the eight video segments of the topic. Each part of the caption script had
a specific number of words in it which reflected the caption speed. For example, a segment
captioned at 110 WPM would have exactly 55 words.
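
The relationship between caption speed and script length is simple arithmetic: a 30-second segment holds half a minute of text. The following minimal Python sketch tabulates the implied word count for each of the eight speeds used in the study. The names `SPEEDS_WPM` and `words_per_segment` are illustrative and do not come from the report.

```python
# Caption speeds used in the study, in words per minute (WPM).
SPEEDS_WPM = [96, 110, 126, 140, 156, 170, 186, 200]
SEGMENT_SECONDS = 30  # each video segment lasted 30 seconds


def words_per_segment(wpm, seconds=SEGMENT_SECONDS):
    """Number of script words needed for a segment to run at the given rate."""
    # All study speeds are even, so the division by 60 is exact for 30 seconds.
    return wpm * seconds // 60


word_counts = {wpm: words_per_segment(wpm) for wpm in SPEEDS_WPM}
print(word_counts)
```

For the 110 WPM segment this gives 55 words, matching the example in the text.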

The caption speeds used were 96, 110, 126, 140, 156, 170, 186, and 200 WPM. The order 
of these speeds was randomly varied for each topic, with care being taken so that extreme speeds did 
not follow one another. For example, a 96 WPM segment was never followed by a 200 WPM 
segment. The objective was to avoid sudden extreme changes in caption speed that might artificially
influence subject assessment.



The words of the script for each topic were encoded on the tapes as closed captions. A
short, two-segment topic on the subject of "Art" was created as practice material to be put at the
beginning of each tape. A total of six different experimental tapes were then made, each
representing a different order of the three topics (123, 132, 213, 231, 312, and 321). Each final
version of the experimental tape had the two "Art" practice segments followed by the three
experimental topics in a particular order.
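
As an illustration, the six tape orders listed above are all the permutations of three topic numbers. A short Python sketch enumerates them. The variable names are hypothetical, and the report does not say which topic carried which number.

```python
from itertools import permutations

# Topic numbers as used in the text; the topic-to-number mapping is not given.
TOPIC_NUMBERS = (1, 2, 3)

# Every possible ordering of the three topics, written as in the report.
tape_orders = ["".join(str(t) for t in order) for order in permutations(TOPIC_NUMBERS)]
print(tape_orders)
```

Three topics yield 3 x 2 x 1 = 6 orders, so six tapes cover every possible sequence.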



Data collection instrument

All subjects were given a spoken and signed introduction, and then handed a six-page data
collection instrument. This instrument contained more introductory material and room for the
subjects to record their responses to four things:

1. A background questionnaire.

2. A simple vision test. 

3. A practice video. 

4. Three captioned videos. 

There were separate background questionnaires for adults and students. Both contained 
items for age, sex, hearing loss, number of people in household, and television viewing habits. In 
addition, the adult questionnaire asked for educational background and employment information, 
while the student questionnaire asked for the student's grade. 

A simple vision test was given to all subjects. This was done to assure that they were 
physically able to see the captions on the television screen. A simple eye chart was placed on the 
screen and the subjects were asked to copy the letters of the eye chart onto a blank paper form. The 
smallest characters on the eye chart were considerably smaller than the caption characters, assuring 
that anyone who could copy the eye chart could see the captions clearly. The results of copying the 
eye chart were examined before the test videos were shown. Anyone having problems filling out the
eye chart was moved closer to the screen.

The third part of the data collection instrument gave a definition of the response categories to 
be used and a place for the subjects to mark their responses to the two practice video segments. The 
response categories used in this study and their definitions were: 

Category    Definition

Too Fast    Captions should be slower. Hard to read the captions. I miss some words.
Fast        Captions should be slightly slower. Captions should be on the screen a little longer.
OK          Caption speed is comfortable to me.
Slow        Captions should be slightly faster. Captions are on the screen a little too long.
Too Slow    Captions should be much faster. I am bored while reading them.

After viewing a video segment, each subject marked a category box corresponding to his or her 
judgement of the caption speed. 

The fourth part of the data collection instrument consisted of forms for the subjects to use in 
recording their responses to the experimental video segments. The layout of these forms was the 
same as for the two practice video segments. 

Experimental procedure 

All subjects were seated about 10 feet from a 27-inch television set. The experimenter gave a 
brief introduction to the study and handed out the data collection instrument. The subjects filled out 
the background questionnaire and copied the eye chart characters from the television screen to their 
paper form. The experimenter observed them while they copied the eye chart and anyone having
problems was urged to move closer to the screen. 

The categories to be used for assessing caption speed were explained and the two practice 
videos were shown. Any questions the subjects had concerning the caption assessment were
answered.

The subjects then viewed all 24 captioned video segments without interruption except to
mark their forms. There was a 10-second gap between segments for this purpose. The experimenter
observed the subjects and paused the tape if the 10-second gap was not enough time for everyone to
finish marking their form. Most subjects had enough time and it was seldom necessary to pause the
tape.

After all 24 experimental video segments had been shown, all papers were collected from the 
subjects and there was a short discussion during which any questions the subjects had were 
answered. Finally, each subject was given $5 as an honorarium for taking part in the study. 

Data was collected from 578 subjects, coded, and entered into a computer file. Because of 
careful experimental administration, there was very little missing data. The data file was checked for
accuracy, and then subjected to a statistical analysis, the results of which are presented in the next
section.



Results 



Composite Scores

Each subject's overall score for each topic was calculated by adding up the responses for the
eight segments of the topic and dividing by 8. The mean for each topic over all subjects was then
calculated and the results are given in Table 1. There was no significant difference between the
scores on the three topics, so it was decided to create and work with composite scores.



Table 1

Scores for Each Topic
(N = 573)

Topic              Mean   St. Dev.
Washington D.C.    3.02   0.93
Space Shuttle      3.13   0.93
Sailing            3.09   0.94



The scores on the three topics for each subject were added together and divided by 3 to get 
across-topic composite scores for each speed on each subject. Table 2 gives the mean and standard 
deviation of the composite score for each speed. Adding together the subject’s composite scores for 
each speed and then dividing by 8 created an overall composite score. The mean of the overall 
composite score was 3.09 and the standard deviation was 0.39. Figure 1 shows a histogram of the
overall composite scores and indicates they form a reasonable approximation of a normal 
distribution. In the remainder of this study, analysis will focus on the composite scores. 
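
The composite scoring described above can be sketched in Python. The sketch assumes responses were coded 1 ("Too Slow") through 5 ("Too Fast"); this is consistent with "3" meaning "OK" and higher scores meaning faster captions, but the exact coding is not spelled out in the text. The responses shown are invented for a single hypothetical subject and are not study data, and all names are illustrative.

```python
# Assumed numeric coding of the five response categories (not stated in the report).
SCORE = {"Too Slow": 1, "Slow": 2, "OK": 3, "Fast": 4, "Too Fast": 5}
SPEEDS = [96, 110, 126, 140, 156, 170, 186, 200]


def composite_scores(responses):
    """responses: {topic: {speed: category}} -> ({speed: composite}, overall)."""
    per_speed = {}
    for speed in SPEEDS:
        # Across-topic composite: average the three topic scores at this speed.
        scores = [SCORE[by_speed[speed]] for by_speed in responses.values()]
        per_speed[speed] = sum(scores) / len(scores)
    # Overall composite: average the eight per-speed composites.
    overall = sum(per_speed.values()) / len(per_speed)
    return per_speed, overall


# Invented responses for one hypothetical subject.
example = {
    "Sailing": dict(zip(SPEEDS, ["Slow", "Slow", "OK", "OK", "OK", "Fast", "Fast", "Too Fast"])),
    "Space": dict(zip(SPEEDS, ["Too Slow", "Slow", "OK", "OK", "OK", "OK", "Fast", "Too Fast"])),
    "Capital": dict(zip(SPEEDS, ["Slow", "OK", "OK", "OK", "Fast", "Fast", "Fast", "Too Fast"])),
}
per_speed, overall = composite_scores(example)
print(per_speed, overall)
```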



Comfortable Caption Speed

In the score coding used, "3" indicates the caption speed is "OK", defined as "Caption speed
is comfortable to me." A higher score indicates the caption speed is faster than is comfortable, and a
lower score indicates the captioning is slower than is comfortable. Table 2 indicates that a mean
score of "3" would be associated with a caption speed of between 140 and 156 WPM. Using simple
interpolation, the "OK" speed is estimated at 145 WPM. Figure 2 shows this graphically.
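
The interpolation can be reproduced from the two Table 2 means that bracket a score of 3 (2.89 at 140 WPM and 3.22 at 156 WPM). The following Python sketch assumes straight-line interpolation between those two points, which is one reading of "simple interpolation"; the function name is illustrative.

```python
def interpolate_speed(low_point, high_point, target_score=3.0):
    """Linearly interpolate the speed at which the mean score equals target_score.

    Each point is a (speed_wpm, mean_score) pair.
    """
    (x0, y0), (x1, y1) = low_point, high_point
    return x0 + (target_score - y0) * (x1 - x0) / (y1 - y0)


# Means reported in Table 2 for the two speeds on either side of a score of 3.
ok_speed = interpolate_speed((140, 2.89), (156, 3.22))
print(round(ok_speed))
```

The result is a little over 145 WPM, in line with the estimate given in the text.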



Table 2

Scores at Each Caption Speed
(N = 573)

Speed (WPM)        Mean   St. Dev.
96                 2.21   0.68
110                2.61   0.54
126                2.79   0.51
140                2.89   0.47
156                3.22   0.48
170                3.49   0.55
186                3.60   0.62
200                3.95   0.66
Combined Speeds    3.09   0.39



Hearing Status 

The scores were broken down by whether the subject was deaf, hard of hearing, or hearing.
Table 3 gives the mean score for subjects in each hearing category at each caption speed. Figure 3
shows this in a graphic format. The differences between groups were especially noticeable at higher
captioning speeds. Overall, the mean score was 3.01 for deaf subjects, 3.04 for hard of hearing
subjects, and 3.18 for hearing subjects. An analysis of variance indicated a significant difference
between the groups on overall scores (F=12.572, df=2/569, p<.0001). The basic conclusion is that
the more hearing people had, the slower they wanted the captions to be.



Table 3

Mean Score by Hearing Status
(N = 573)

                              Words Per Minute
                96    110   126   140   156   170   186   200   Overall Score
Deaf            2.32  2.61  2.77  2.86  3.12  3.35  3.35  3.68  3.01
HOH             2.19  2.65  2.68  2.83  3.22  3.44  3.54  3.82  3.04
Hearing         2.12  2.60  2.84  2.93  3.29  3.63  3.81  4.20  3.18
All Subjects    2.21  2.61  2.79  2.89  3.22  3.49  3.60  3.95  3.09






Viewing Frequency

It was expected that the hearing subjects would want slower captions because they had less
experience watching captions and were not used to reading them. An analysis was done of how
often people watched captioned television. The categories for this variable were "Daily", "Weekly",
"Monthly", "Yearly", and "Never". It was found that there was no significant difference between the
scores for the "Weekly" and "Monthly" categories, or between the "Yearly" and "Never"
categories, so these were combined. The final categories used were "Daily", "Weekly/Monthly", and
"Yearly/Never".



Table 4 shows the number of subjects according to their hearing status and the frequency
with which they watch captioned television. The association between hearing status and viewing
frequency in Table 4 is highly significant (chi-square=266.218, df=4, p<.0001). Deaf and hard of
hearing people tend to watch captioned television daily and hearing people seldom watch it.









Table 4

How Often Captions are Watched

                      Deaf         HOH          Hearing      All Subjects
                      N     %      N     %      N     %      N     %
Daily                 169   83     74    68     30    11     273   48
Weekly / Monthly      20    10     19    17     81    31     120   21
Yearly / Never        14    7      16    15     151   58     181   32
All Subjects          203   100    109   100    262   100    574   100
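
The chi-square statistic quoted in the text can be checked against the counts in Table 4. The following Python sketch computes a Pearson chi-square test of independence by hand, without a statistics library. The function and variable names are illustrative, and the software used for the original analysis is not stated in the report.

```python
# Observed counts from Table 4: rows are viewing frequency,
# columns are Deaf, HOH, Hearing.
observed = [
    [169, 74, 30],   # Daily
    [20, 19, 81],    # Weekly / Monthly
    [14, 16, 151],   # Yearly / Never
]


def chi_square_independence(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for count, col_total in zip(row, col_totals):
            # Expected count if hearing status and viewing frequency were independent.
            expected = row_total * col_total / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


chi2, dof = chi_square_independence(observed)
print(round(chi2, 3), dof)
```

With these counts the statistic comes out very close to the reported 266.218, with 4 degrees of freedom.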



As previously mentioned, comfortable caption speed has a relation to the frequency with 
which people watch captioned television. Table 5 gives the mean of the overall score for each 
caption viewing frequency category. Over all subjects, people who seldom watch captions tend to 
want slightly slower captions (df=2/568, F=14.835, p<.0001).



Table 5

Mean Overall Scores by Caption Viewing Frequency
(N = 573)

Viewing Frequency     Mean Overall Score
Daily                 3.01
Weekly / Monthly      3.12
Yearly / Never        3.20
All Frequencies       3.09



The questionnaire also asked subjects how many years they had been watching closed
captions. Number of years of caption viewing had no relationship to how comfortable different
caption speeds were.



It was originally thought that there might be a relationship between age and the caption 
speeds an individual thought were comfortable. Teenagers might prefer slower captions because 
they are still in the process of being educated. Subjects over 40 years of age might prefer slower 
captions because eyesight usually begins to deteriorate at about that age. However, examination of a 
scatter plot between overall score and age showed that there was no relationship between age and 
comfortable caption speed. The correlation between age and overall score was r = .11, clearly
non-significant.



The mean overall scores for males and females were 3.04 and 3.14, respectively. This is
significant (df=571, t=3.001, p=.0028), but the difference could be traced to hearing status. When
hearing status was controlled, there was no significant difference in caption speed scores between the
two sexes.

Education

The adult subjects were asked the highest level of education they had completed. The
responses of those who answered (n=402) were coded into "High School or Less", "Trade School or
College", and "Graduate School". The mean overall scores for these three categories were 3.15,
3.15, and 3.03. Subjects who had attended graduate school preferred slightly faster captions, but the
results were not quite significant (df=2/399, F=2.776, p=.0635). Educational level does not appear
to play a meaningful role in the caption speed considered comfortable by adults.

A total of 120 students indicated the school grade they were in. No significant difference in 
overall caption speed score was found between grades. 

School-Aged Deaf and Hard of Hearing Subjects 

In this study we were especially interested in the caption speed scores of school-aged deaf
and hard of hearing people because of the potential educational impact of captioning. The study had 
160 deaf and hard of hearing subjects under the age of 20. All but 13 of these students were 
teenagers. The mean age was 15.2 years, with a standard deviation of 2.2 years. There were 94 
male and 66 female subjects, with 106 being deaf and 54 being hard of hearing. 

The means of the scores at each speed and the overall score are given in Table 6. These 
means are very close to those given in Table 2 for all subjects in the study and the overall 
comfortable speed is estimated to be around 147 WPM. This indicates that deaf and hard of hearing 
teenagers are most comfortable at approximately the same caption speeds as the overall viewing 
population. 






Table 6

Scores for Deaf and Hard of Hearing Teenagers
(N = 160)

Words Per Minute   Mean   Std. Dev.
96                 2.21   0.77
110                2.60   0.63
126                2.72   0.53
140                2.89   0.57
156                3.15   0.49
170                3.38   0.61
186                3.39   0.65
200                3.73   0.74
All Speeds         3.01   0.41



Table 7 gives the frequency with which the students reported watching captioned television.
The results are extremely interesting, with 12 percent of the students reporting that they watched
captioned television "Yearly/Never". These responses were noted during data collection and some
of the subjects were questioned about them. Many of the respondents who reported that they seldom
watch captioned television were day students who came from poor inner-city homes with old
(pre-July 1993) television sets which did not have caption decoders built in. These students had little
access to captioned materials, a major educational disadvantage for them. They did watch some
captioned television as part of their schoolwork, but they considered this "work." To them, "watching
captioned television" means recreational viewing at home.

Table 7

Frequency of Caption Viewing by
Deaf and Hard of Hearing Teenagers

                      N     %
Daily                 112   71
Weekly / Monthly      26    17
Yearly / Never        19    12
All D/HOH Teens       157   100






Deaf students and hard of hearing students did not differ significantly in frequency of captioned
television viewing. There was also no significant relationship between viewing frequency and 
caption speed comfort. 

Discussion 

A previous study by Jensema et al. (1996) indicated that the overall mean speed of
captioned television programs is 141 WPM, with a standard deviation of 21 WPM. A major goal of
the study reported here was to determine how this compared with the caption speeds with which
people were most comfortable. The data indicate that the mean caption speed that “is comfortable to
me” is about 145 WPM, very close to the 141 WPM mean rate actually found in television programs.
This study used 30-second video segments and watching these is obviously not directly comparable 
to watching a full-length television program. However, the results are suggestive and indicate that 
the caption speed rates used today are comfortable for most viewers. 

Of particular interest in this study was the adaptability exhibited by the respondents. As 
caption speed increased, the respondents recognized this, but most seemed able to adjust and did not 
appear to consider the captions unacceptable. Table 2 shows that at 170 WPM the mean score was 
3.49, about halfway between "Caption speed is comfortable to me" and "Captions should be slightly
slower. Captions should be on the screen a little longer." This suggests that most viewers are able 
to adjust to higher captioning rates and will not object to verbatim captions when the audio rate 
picks up. 

Another way of looking at this is to determine how many subjects checked the "Too Fast"
category at different caption speeds. This category was defined as "Captions should be much slower.
Hard to read the captions. I miss some words." The percentages of subjects checking "Too Fast" at
various caption speeds were 200 WPM - 28%, 186 WPM - 12%, 170 WPM - 9%, 156 WPM - 4%, and
140 WPM - 1%. Apparently, most subjects do not have significant trouble with the captions
until the caption rate is at least 170 WPM. The mean speed of captioning shown on television today
(141 WPM) certainly seems acceptable. Only about 1% would consider 141 WPM "Too Fast."

It was expected that hearing people would not depend on captions and would have less 
practice in reading captions. Because of this, hearing people were expected to want slower captions. 
Table 3 showed that the more hearing people had, the slower they wanted captions to be. Table 5 
showed that the less subjects viewed captions, the slower they wanted the captions to be. 

The experimental tapes in this study had no audio and hearing people became effectively
"deaf" for purposes of the experiment. The score differences in Tables 3 and 5 are not large, and the
findings suggest that a newly deafened person needs relatively little practice to adjust to reading 
television captions. This conclusion was also supported by the finding that number of years of 
caption viewing had no relation to the scores. People apparently adjust to caption reading quickly 
and practice beyond this makes little difference. 






A very important issue, one that was not covered in this study, is the age at which 
caption speed begins to matter. The study had only a few subjects under the age of 13. Certainly, 
most children are reading captions at a much younger age, but how young and how fast can they 
read? Further work is needed to determine the age at which children start to read captions and the 
speeds they can handle as their caption reading skills improve. 



References 

Jensema, C., McCann, R., and Ramsey, S. (1996). Closed-Captioned Television Presentation Speed
and Vocabulary. American Annals of the Deaf, 141(4), 284-292.



Funds for this research study and others were provided by the U.S. Department of Education under 
grant number H180G40037 for “Presentation Rate and Readability of Closed Caption Television.”
The amount of the award was $379,080. 






Figure 1 - Histogram of Overall Scores

[Figure not reproduced. Vertical axis: Frequency.]



Figure 2 - Evaluation of Caption Speed

[Figure not reproduced. Horizontal axis: Words Per Minute.]



Figure 3 - Evaluation of Caption Speed by Hearing Status

[Figure not reproduced. Horizontal axis: Words Per Minute. Vertical axis: Mean Score.]



Word Frequency 
1 



Word Frequency in Captioned Television 



Carl Jensema, Ph.D. 

Institute for Disabilities Research and Training, Inc. 
2424 University Boulevard West 
Silver Spring, MD 20902 
(301)942-4326 V/TTY
Email: IDRT@aol.com 



Michele R. Rovins, M.A. 

Institute for Disabilities Research and Training, Inc. 
2424 University Boulevard West 
Silver Spring, MD 20902 
(301)942-4326 V/TTY 
Email: MROVINS@aol.com 





Introduction 

Reading is often one of the main ways deaf people gain information and develop 
independence in learning. In recent years, television captioning has become a prime source of 
reading material. As Koskinen, Wilson, and Jensema (1985) noted, "Captions are reading
material. They can turn television into a moving story book, a steady stream of written
language presented with both video and audio reinforcement. Viewers can see words on the
screen, hear them spoken, and see them put into a visual context. One of the most exciting
potential applications of closed captioning is its use as an educational tool."

The use of captioned television as reading material is difficult if there are no reading skills 
to begin with. People need some starting point, the ability to read at least some words. In this 
study, a relatively short list of frequently used words is presented. The authors believe that 
mastery of these words can greatly assist in expanding reading skills. 

The report presented here is based on research by Jensema, McCann, and Ramsey (1996). 
They obtained and analyzed caption data from 183 television programs and 22 music videos. The 
programs varied from thirty minutes to four hours, and the music videos were between two and 
five minutes in length. The research examined speed, word length, and similar characteristics of 
the captions. It was noted that relatively few distinct words accounted for a large proportion of 
the total words used in the captions. 

In the present article, the observation that few words account for a large part of the total
words used in captioning is carried further and the data is analyzed in more detail. The result is a
caption word frequency list, the mastery of which is likely to provide an important assist to the
reading skills of caption viewers.

Method 

The caption scripts from all the programs in the study by Jensema et al. (1996) were
combined into one large computer file. The file was edited to remove punctuation and anything 
else which was not a word. The resulting file had 834,726 words. 

Many words were merely variations of another word, so word endings of
"s", "es", "ed", "ing", and "d" were deleted. On the other hand, certain endings created a new
word which had a different meaning, so it was decided to keep word endings of "ly", "'t", "ive",
"ion", "er", and "'re". Certain non-standard "words", such as "uh", "mmmm" and "ahhhh" were
kept, since they are commonly used in captioning to indicate certain sounds in the audio.

The resulting edited 834,726 word list was sorted alphabetically, duplicate words were 
counted and then deleted, and the remaining list was sorted by frequency of occurrence. The final 
frequency list had 16,102 unique words, most of which were used only a few times. 
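
The counting procedure described above can be sketched in Python. This is a minimal illustration, not the authors' procedure: the report does not say how word endings were removed, so the rule used here (strip a dropped ending only when the remaining stem also occurs as a word in the text) is an assumption, as are all function names.

```python
from collections import Counter
import re

# Endings the report says were deleted; longer endings are tried first.
DROPPED_ENDINGS = ("ing", "es", "ed", "s", "d")


def tokenize(text):
    """Lowercase the text and keep only runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def strip_ending(word, vocabulary):
    """Assumed rule: drop an ending only if the stem is itself a word in the text."""
    for ending in DROPPED_ENDINGS:
        stem = word[: -len(ending)]
        if word.endswith(ending) and stem in vocabulary:
            return stem
    return word


def frequency_list(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    tokens = tokenize(text)
    vocabulary = set(tokens)
    counts = Counter(strip_ending(token, vocabulary) for token in tokens)
    return counts.most_common()


print(frequency_list("The dogs walked and the dog walks the walk"))
```

In this toy sentence "dogs" folds into "dog" and "walked" and "walks" fold into "walk", while "and" is left alone because "an" does not occur in the text.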

Results 

Table 1 presents a frequency count of the 250 words used most often in the television 
captions in this study. Out of 834,726 captioned words, 30,142 were the word "the", 22,600 
were the word "you", and so on. 

In Table 1, the word "the" accounted for 3.61% of the 834,726 captioned words. The
words "the" and "you" together accounted for 6.32% of the 834,726 captioned words.
Continuing in this manner, Table 1 shows that 250 unique words account for over 68% of all the
words used in captioned television.
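
The cumulative percentages can be reproduced directly from the frequency counts. The sketch below uses the ten most frequent words and their counts as reported in Table 1; the variable names are illustrative.

```python
from itertools import accumulate

TOTAL_WORDS = 834_726  # total captioned words in the combined file

# The ten most frequent words and their counts, as reported in Table 1.
TOP_TEN = [
    ("the", 30142), ("you", 22600), ("to", 22161), ("a", 20023), ("I", 19991),
    ("and", 16130), ("of", 13914), ("in", 10941), ("it", 10496), ("that", 10395),
]

# Running total of counts, expressed as a percentage of all captioned words.
running_totals = accumulate(count for _, count in TOP_TEN)
cumulative = [100 * total / TOTAL_WORDS for total in running_totals]

for (word, _), percent in zip(TOP_TEN, cumulative):
    print(f"{word:<5}{percent:6.2f}%")
```

The first two values round to 3.61% and 6.32%, and the ten words together account for a little over 21% of all captioned words.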




Discussion 

The implications of Table 1 are striking. There are more than 500,000 words in the 
English language, but a person who masters the use of the 250 words in Table 1 will recognize 
more than two-thirds of all words shown in television captions. This is a tremendous advantage 
for any person with limited reading skills who attempts to read captioned television. 

A beginning reader could be taught just 10 words (the, you, to, a, I, and, of, in, it, that) 
and would then recognize more than one out of every five words which appeared on a captioned 
television program. Being able to read 79 words means being able to read half of all words 
captioned. By using Table 1 as a guideline in teaching reading, a teacher can maximize the 
captioned words a student will recognize while watching television. It is suggested that teachers 
of deaf and hard of hearing students consider Table 1 carefully in planning their strategy for 
teaching reading. 

The majority of the words on the list are everyday linking words, including many 
prepositions and pronouns. Prepositions, in particular, are traditionally problem areas for many 
deaf students because American Sign Language does not have prepositions. 

Research shows that students learn vocabulary both definitionally and contextually (Stahl 
& Fairbanks, 1986). The words in Table 1 can be taught definitionally in context. Those students 
who develop a working knowledge of the 250 words will be able to apply them in a variety of 
situations and will be able to focus on other captioned television words that they may not 
understand. 







References 

Jensema, C., McCann, R. & Ramsey, S. (1996). Closed-captioned television presentation speed
and vocabulary. American Annals of the Deaf, 141(4), 284-292.

Koskinen, P.S., Wilson, R. M. & Jensema, C. J. (1985). Closed-captioned television: A new
tool for reading instruction. Reading World, 24, 1-7.

Stahl, S.A. & Fairbanks, M. M. (1986). The effects of vocabulary instruction: A model-based
meta-analysis. Review of Educational Research, 56, 72-110.



Table 1

Frequently Used Words

Word        Freq. Cum. %  Word        Freq. Cum. %  Word        Freq. Cum. %  Word        Freq. Cum. %  Word        Freq. Cum. %
THE        30,142      4  IF          2,751     42  US          1,381     54  LET           863     61  AM            598     65
YOU        22,600      6  WANT        2,730     43  I'LL        1,369     54  LIFE          859     61  LONG          593     65
TO         22,161      9  AS          2,714     43  YES         1,364     55  OTHER         852     61  ASK           587     65
A          20,023     11  NOW         2,696     43  HE'S        1,359     55  NIGHT         831     61  TODAY         587     65
I          19,991     14  SHE         2,686     44  THANK       1,352     55  THEY'RE       829     61  NAME          583     65
AND        16,130     16  THINK       2,606     44  LITTLE      1,351     55  HELP          805     61  RUN           583     65
OF         13,914     17  HER         2,591     44  LOVE        1,340     55  HAPPEN        802     61  PLACE         581     65
IN         10,941     19  GO          2,584     45  WHY         1,278     55  WHAT'S        800     62  STOP          580     66
IT         10,496     20  WILL        2,522     45  REALLY      1,263     56  THOSE         784     62  WHICH         570     66
THAT       10,395     21  WELL        2,442     45  TELL        1,256     56  THAN          782     62  SORRY         566     66
IS          8,764     22  GOING       2,428     45  OVER        1,249     56  FIND          776     62  FRIEND        564     66
THIS        7,116     23  HIS         2,409     46  CALL        1,241     56  LAST          760     62  BETTER        563     66
FOR         6,679     24  GOT         2,375     46  CAN'T       1,192     56  WORLD         760     62  THROUGH       562     66
ON          6,411     25  FROM        2,373     46  WHERE       1,179     56  AFTER         756     62  HOUSE         559     66
WAS         5,945     25  THAT'S      2,343     47  SAID        1,169     56  SHE'S         743     62  DOES          558     66
HAVE        5,804     26  LOOK        2,324     47  DAY         1,163     57  MR            741     62  FAMILY        555     66
ME          5,740     27  HIM         2,316     47  NEVER       1,158     57  EVEN          740     62  KIND          554     66
WE          5,521     27  YOU'RE      2,285     47  SOMETHING   1,158     57  HOME          735     62  MAY           551     66
WHAT        5,464     28  TIME        2,243     48  WE'RE       1,155     57  AGAIN         727     62  MOST          548     66
BE          5,449     29  WHEN        2,231     48  THEN        1,140     57  MADE          719     63  GOD           530     66
HE          5,218     29  SEE         2,230     48  TWO         1,133     57  BIG           718     63  WOMAN         524     66
WITH        4,895     30  HOW         2,214     48  BECAUSE     1,115     57  DOING         718     63  MANY          512     66
MY          4,834     31  SAY         2,200     49  THEIR       1,089     58  PLEASE        712     63  HI            510     67
YOUR        4,385     31  GOOD        2,155     49  HEY         1,087     58  PUT           711     63  NOTHING       509     67
DO          4,375     32  BY          2,115     49  FIRST       1,065     58  LOT           709     63  NEXT          508     67
I'M         4,258     32  HAD         2,041     49  NEED        1,049     58  SHOULD        700     63  MOVE          503     67
ARE         4,224     33  YEAH        1,971     50  TOO         1,048     58  BEFORE        694     63  ANOTHER       499     67
ALL         4,129     33  AN          1,968     50  DIDN'T      1,040     58  AROUND        688     63  CAME          498     67
NOT         4,117     34  WOULD       1,899     50  HA          1,034     58  WAIT          688     63  TONIGHT       495     67
IT'S        4,111     34  DID         1,804     50  NEW         1,023     58  STILL         687     63  LEFT          493     67
KNOW        3,962     35  TAKE        1,794     51  TALK        1,020     59  START         684     64  TURN          484     67
NO          3,890     35  WERE        1,765     51  INTO        1,012     59  LIVE          680     64  DOESN'T       483     67
BUT         3,885     35  MAKE        1,757     51  WORK        1,007     59  USE           675     64  I'D           482     67
DON'T       3,859     36  BACK        1,739     51  PLAY        1,006     59  SURE          674     64  NEITHER       481     67
GET         3,739     36  WHO         1,719     51  TRY           998     59  KEEP          671     64  MUST          476     67
THEY        3,612     37  BEEN        1,707     52  MUCH          988     59  SIR           670     64  KILL          472     67
LIKE        3,436     37  HAS         1,697     52  GUY           987     59  OLD           667     64  HAND          470     67
SO          3,425     38  THEM        1,599     52  I'VE          980     59  MAYBE         657     64  STAY          468     67
JUST        3,300     38  OR          1,553     52  UH            976     60  WE'LL         653     64  WATCH         467     67
AT          3,295     38  SOME        1,547     52  MEAN          954     60  THOUGHT       652     64  YOU'VE        467     68
HERE        3,197     39  MAN         1,529     53  THERE'S       954     60  BELIEVE       650     64  CHILDREN      465     68
OUT         3,117     39  VERY        1,510     53  ONLY          938     60  BOY           646     64  HEAR          463     68
UP          3,074     40  OUR         1,475     53  GIVE          924     60  THREE         644     64  HOPE          462     68
ABOUT       3,031     40  DOWN        1,474     53  OFF           920     60  EVERY         641     65  MOTHER        455     68
ONE         2,998     40  THING       1,456     53  ANY           917     60  CAPTION       639     65  NICE          455     68
RIGHT       2,906     41  WAY         1,431     53  FEEL          907     60  EVER          639     65  REMEMBER      454     68
COME        2,904     41  YEAR        1,420     54  THESE         905     60  SHOW          636     65  OWN           453     68
THERE       2,886     41  PEOPLE      1,409     54  GREAT         884     60  AWAY          635     65  WON'T         451     68
OH          2,781     42  COULD       1,408     54  LET'S         884     61  ALWAYS        626     65  MORNING       449     68
CAN         2,772     42  MORE        1,383     54  PREPARE       871     61  ANYTHING      607     65  EVERYTHING    446     68








U.S. Department of Education

Office of Educational Research and Improvement (OERI) 
National Library of Education (NLE) 
Educational Resources Information Center (ERIC) 




REPRODUCTION RELEASE 

(Specific Document) 



I. DOCUMENT IDENTIFICATION: 



Title: Presentation Rate and Readability of Closed Captioned Television. Final Report


Author(s): Carl Jensema




Corporate Source: Institute for Disabilities Research and Training, Inc.


Publication Date: 6/30/97


II. REPRODUCTION RELEASE: 





In order to disseminate as widely as possible timely and significant materials of interest to the educational community, documents announced in the 
monthly abstract journal of the ERIC system, Resources in Education (RIE), are usually made available to users in microfiche, reproduced paper copy, 
and electronic media, and sold through the ERIC Document Reproduction Service (EDRS). Credit is given to the source of each document, and, if
reproduction release is granted, one of the following notices is affixed to the document. 

If permission is granted to reproduce and disseminate the identified document, please CHECK ONE of the following three options and sign at the bottom 
of the page. 



The sample sticker shown below will be 
affixed to all Level 1 documents 


The sample sticker shown below will be 
affixed to all Level 2A documents 


The sample sticker shown below will be 
affixed to all Level 2B documents 


PERMISSION TO REPRODUCE AND 
DISSEMINATE THIS MATERIAL HAS 
BEEN GRANTED BY 




PERMISSION TO REPRODUCE AND 
DISSEMINATE THIS MATERIAL IN 
MICROFICHE, AND IN ELECTRONIC MEDIA 
FOR ERIC COLLECTION SUBSCRIBERS ONLY, 
HAS BEEN GRANTED BY 




PERMISSION TO REPRODUCE AND 
DISSEMINATE THIS MATERIAL IN 
MICROFICHE ONLY HAS BEEN GRANTED BY 




TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) 




TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) 




TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) 


1 




2A 




2B 



Level 1 




Level 2A 




Level 2B 







Check here for Level 1 release, permitting reproduction 
and dissemination in microfiche or other ERIC archival 
media (e.g., electronic) and paper copy.



Check here for Level 2A release, permitting reproduction 
and dissemination in microfiche and in electronic media 
for ERIC archival collection subscribers only 



Check here for Level 2B release, permitting
reproduction and dissemination in microfiche only 



Documents will be processed as indicated provided reproduction quality permits. 

If permission to reproduce is granted, but no box is checked, documents will be processed at Level 1.



Sign here, please




I hereby grant to the Educational Resources Information Center (ERIC) nonexclusive permission to reproduce and disseminate this document
as indicated above. Reproduction from the ERIC microfiche or electronic media by persons other than ERIC employees and its system 
contractors requires permission from the copyright holder. Exception is made for non-profit reproduction by libraries and other service agencies 
to satisfy information needs of educators in response to discrete inquiries. 


Signature:


Printed Name/Position/Title:

Carl Jensema, Ph.D.


Organization/Address:

Institute for Disabilities Research and Training

2424 University Boulevard West

Silver Spring, MD 20902


Telephone: 

f. 30/ 94 Z 432.L 


FM 3o, 2 


E-Mail Address:

7ART& AOL.Qi>(A 


Date: £>//S/?X



(over) 



